Search Results: "anle"

29 April 2014

Russell Coker: Autism and the Treatment of Women Again

Background

I've previously written about the claim that people use Autism as an excuse for bad behavior [1]. In summary it doesn't, and such claims instead lead to people not being assessed for Autism. I've also previously written about empathy and Autism in the context of discussions about conference sexual harassment [2]. The main point is that anyone who's going to blame empathy disorders for the widespread mistreatment of women in society, and divert the subject from the actions of average men to men in minority groups, isn't demonstrating empathy. Discussions of the actions of average men are so often derailed to cover Autism that the Geek Feminism Wiki has a page about the issue of blaming Autism [3].

The Latest Issue

Last year Shanley Kane wrote an informative article for Medium titled "What Can Men Do" about the treatment of women in the IT industry [4]. It's a good article, I recommend reading it. As an aside, @shanley's twitter feed is worth reading [5]. In response to Shanley's article Jeff Atwood wrote an article of the same title this year which covered lots of other things [6]. He writes about Autism but doesn't seem to realise that officially Asperger Syndrome is now Autism according to DSM-V (they decided that separate diagnoses of Autism, Asperger Syndrome, and PDD-NOS were too difficult and merged them). Asperger Syndrome is now a term that refers to historic issues (IE research that was published before DSM-V) and slang use.

Gender and the Autism Spectrum

Jeff claims that autism skews heavily towards males at a 4:1 ratio and cites the Epidemiology of Autism Wikipedia page as a reference. Firstly, that page isn't a great reference; I fixed one major error (which was obviously wrong to anyone who knows anything about Autism and also contradicted the cited reference) in the first section while writing this post. The Wikipedia page cites a PDF about the Epidemiology of Autism that claims the 4.3:1 ratio of boys to girls [7].
However that PDF is a summary of other articles and the one which originated the 4.3:1 claim is behind a paywall. One thing that is worth noting in the PDF is that the section containing the 4.3:1 claim also references claims about correlations between race and Autism and studies contradicting such claims; it notes the possibility of "ascertainment bias". I think that anyone who reads that section should immediately consider the possibility of ascertainment bias in regard to the gender ratio.

Most people who are diagnosed with Autism are diagnosed as children. An Autism diagnosis of a child is quite subjective; an important part is an IQ test (where the psychologist interprets the intent of the child in the many cases where answers aren't clear) to compare social skills with IQ. So whether a child is diagnosed is determined by the psychologist's impression of the child's IQ vs the impression of their social skills. Whether a child is even taken for assessment depends on whether they act in a way that's considered to be obviously different. Any child who is suspected of being on the Autism Spectrum will be compared to other children who have been diagnosed (IE mostly boys) and this will probably increase the probability that a boy will be assessed. So an Aspie girl might not be assessed because she acts like other Aspie girls, not like the Aspie boys her parents and teachers have seen.

The way kids act is not solely determined by neuro-type. Our society expects and encourages boys to be louder than girls and take longer and more frequent turns to speak; this is so widespread that I don't think it's possible for parents to avoid it if their kids are exposed to the outside world. Because of this, boys who would be diagnosed with Asperger Syndrome by DSM-IV tend to act in ways that are obviously different from other kids, while the combination of Autism and the social expectations on girls tends to result in girls who are quiet, shy, and apologetic.
The fact that girls are less obviously different and that their differences cause fewer difficulties for parents and teachers makes them less likely to be assessed. Note that the differences in behavior of boys and girls who have been diagnosed are noted by the professionals (and were discussed at a conference on AsperGirls that my wife attended) while the idea that this affects assessment rates is my theory.

Jeff also cites the book "The Essential Difference: Male And Female Brains And The Truth About Autism" by Professor Simon Baron-Cohen (who's (in)famous for his "Extreme Male Brain" theory). The first thing to note about the Extreme Male Brain theory is that it depends almost entirely on the 4.3:1 ratio of males to females on the Autism Spectrum (which is dubious as I noted above). The only other evidence in support of it is subjective studies of children which suffer from the same cultural issues; this is why double blind tests should be used whenever possible. The book Delusions of Gender by Cordelia Fine [8] debunks Simon Baron-Cohen's work among other things. The "look inside" feature of the Amazon page for Delusions of Gender allows you to read about Simon Baron-Cohen's work [9].

Now even if the Extreme Male Brain theory had any merit it would be a really bad idea to cite it (or a book based on it) if you want to make things better for women in the IT industry. Cordelia's book debunks the science and also shows how such claims about supposed "essential difference" are taken as exclusionary.

The Problem with Jeff Atwood

Jeff suggests in his post that men should listen to women. Then he and his followers had a huge flame-war with many women over twitter, during which he tweeted "Trying to diversify my follows by following any female voices that engaged me in a civil, constructive way recently". If you only listen to women who agree with you then that doesn't really count as listening to women.
When you have a stated policy of only listening to women who agree then it seems to be more about limiting what women may feel free to say around you. The Geek Feminism wiki page about the Tone Argument [10] says the following:

"One way in which the tone argument frequently manifests itself is as a call for civility. A way to gauge whether a request for civility is sincere or not is to ask whether the person asking for civility has more power along whatever axes are contextually relevant (see Intersectionality) than the person being called "incivil", less power, or equal power. Often, people who have the privilege of being listened to and taken seriously level accusations of "incivility" as a silencing tactic, and label as "incivil" any speech or behavior that questions their privilege. For example, some men label any feminist thought or speech as hostile or impolite; there is no way for anybody to question male power or privilege without being called rude or aggressive. Likewise, some white people label any critical discussion of race, particularly when initiated by people of color, as incivil."

Writing about one topic is also a really good idea. A blog post titled "What Can Men Do" should be about things that men can do. It should not be about Autism, speculation about supposed inherent differences between men and women which is based on bad research, gender diversity in various occupations, etc. Following up a post on "What Can Men Do" with discussion (in blog comments and twitter) about what women should do before they are allowed to join the conversation is ridiculous. Jeff's blog post says that men should listen to women; excluding women based on the tone argument is gross hypocrisy.

Swearing

Jeff makes a big deal of the fact that Shanley uses some profane language in her tweets. This combines a couple of different ways of silencing women. It's quite common for women to be held to a high standard of "ladylike" behavior, while men get a free pass on doing the same thing.
One example of this is the Geek Feminism article about the results of Sarah Sharp's request for civility in the Linux kernel community [11]. That's not an isolated incident; to the best of my recollection, in 20+ years my local Linux Users Group has had only one debate about profanity on mailing lists. In that case a woman (who is no longer active in the group) was criticised for using lesser profanity than men used both before and after with no comment (as an experiment I used some gratuitous profanity a couple of weeks later and no-one commented).

There is also a common difference in interpretation of expressions of emotion: when a woman seems angry she invariably has men tell her to change her approach (even when there are obvious reasons for her anger), while when a man is angry the possibility that other people shouldn't make him angry will usually be considered. The issues related to the treatment of women have had a large effect on Shanley's life and her friends' lives. It's quite understandable that she is angry about this. Her use of profanity in tweets seems appropriate to the situation.

Other Links

Newsweek's "Gentlemen in Technology" article has a section about Jeff [12]; it's interesting to note his history of deleting tweets and editing his post. I presume he will change his post in response to mine and not make any note of the differences. Jacob Kaplan-Moss wrote a good rebuttal to Jeff's post [13]. It's a good article and has some other relevant links that are worth reading.

5 December 2013

Daniel Kahn Gillmor: The legal utility of deniability in secure chat

This Monday, I attended a workshop on Multi-party Off the Record Messaging and Deniability hosted by the Calyx Institute. The discussion was a combination of legal and technical people, looking at how the characteristics of this particular technology affect (or do not affect) the law. This is a report-back, since I know other people wanted to attend. I'm not a lawyer, but I develop software to improve communications security, I care about these questions, and I want other people to be aware of the discussion. I hope I did not misrepresent anything below. I'd be happy if anyone wants to offer corrections.

Background

Off the Record Messaging (OTR) is a way to secure instant messaging (e.g. jabber/XMPP, gChat, AIM). The two most common characteristics people want from a secure instant messaging program are:
Authentication
Each participant should be able to know specifically who the other parties are on the chat.
Confidentiality
The content of the messages should only be intelligible to the parties involved with the chat; it should appear opaque or encrypted to anyone else listening in. Note that confidentiality effectively depends on authentication -- if you don't know who you're talking to, you can't make sensible assertions about confidentiality.
As with many other modern networked encryption schemes, OTR relies on each user maintaining a long-lived "secret key", and publishing a corresponding "public key" for their peers to examine. These keys are critical for providing authentication (and by extension, for confidentiality). But OTR offers several interesting characteristics beyond the common two. Its most commonly cited characteristics are "forward secrecy" and "deniability".
Forward secrecy
Assuming the parties communicating are operating in good faith, forward secrecy offers protection against a special kind of adversary: one who logs the encrypted chat, and subsequently steals either party's long-term secret key. Without forward secrecy, such an adversary would be able to discover the content of the messages, violating the confidentiality characteristic. With forward secrecy, this adversary is stymied and the messages remain confidential.
Deniability
Deniability only comes into play when one of the parties is no longer operating in good faith (e.g. their computer is compromised, or they are collaborating with an adversary). In this context, if Alice is chatting with Bob, she does not want Bob to be able to cryptographically prove to anyone else that she made any of the specific statements in the conversation. This is the focus of Monday's discussion. To be clear, this kind of deniability means Alice can correctly say "you have no cryptographic proof I said X", but it does not let her assert "here is cryptographic proof that I did not say X" (I can't think of any protocol that offers the latter assertion). The opposite of deniability is a cryptographic proof of origin, which usually runs something like "only someone with access to Alice's secret key could have said X."
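The difference between "no cryptographic proof" and a proof of origin can be sketched in a few lines of Python. This is a simplified illustration and not the OTR protocol itself: it assumes Alice and Bob already share a symmetric key and authenticate each message with an HMAC over that key. The names `shared_key`, `tag`, and `verify` are made up for this example. Because Bob holds the same key as Alice, he can produce a valid tag for any text he likes, so a valid tag convinces Bob during the chat but proves nothing to a third party afterwards.

```python
import hashlib
import hmac
import secrets


def tag(shared_key: bytes, message: bytes) -> bytes:
    """Authenticate a message with a key that BOTH parties hold."""
    return hmac.new(shared_key, message, hashlib.sha256).digest()


def verify(shared_key: bytes, message: bytes, mac: bytes) -> bool:
    """Check a tag in constant time."""
    return hmac.compare_digest(tag(shared_key, message), mac)


# Hypothetical setup: a key already agreed between Alice and Bob.
shared_key = secrets.token_bytes(32)

# Alice sends an authenticated message; Bob can check it came from
# someone holding the shared key (from his view: Alice).
alice_msg = b"meet at noon"
alice_tag = tag(shared_key, alice_msg)
bob_accepts = verify(shared_key, alice_msg, alice_tag)

# Bob can compute an equally valid tag for words Alice never wrote,
# so a transcript with valid tags is not proof of what Alice said.
forged_msg = b"I said X"
forged_tag = tag(shared_key, forged_msg)
forgery_verifies = verify(shared_key, forged_msg, forged_tag)

# A tag is still bound to its message: it does not verify other text.
tag_reused_elsewhere = verify(shared_key, b"meet at midnight", alice_tag)
```

A digital signature made with Alice's long-term secret key would behave differently: only the holder of that key could produce it, which is exactly the proof of origin described above.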
The traditional two-party OTR protocol has offered both forward secrecy and deniability for years. But deniability in particular is a challenging characteristic to provide for group chat, which is the domain of Multi-Party OTR (mpOTR). You can read some past discussion about the challenges of deniability in mpOTR (and why it's harder when there are more than two people chatting) from the otr-users mailing list.

If you're not doing anything wrong...

The discussion was well-anchored by a comment from another participant who cheekily asked "If you're not doing anything wrong, why do you need to hide your chat at all, let alone be able to deny it?" The general sense of the room was that we'd all heard this question many times, from many people. There are lots of problems with the ideas behind the question from many perspectives. But just from a legal perspective, there are at least two problems with the way this question is posed: In these situations, people confront real risk from the law. If we care about these people, we need to figure out if we can build systems to help them reduce that legal risk (of course we also need to fix broken laws, and the legal environment in general, but those approaches were out of scope for this discussion).

The Legal Utility of Deniability

Monday's meeting was called specifically because it wasn't clear how much real-world usefulness there is in the "deniability" characteristic, and whether this feature is worth the development effort and implementation tradeoffs required. In particular, the group was interested in deniability's utility in legal contexts; many (most?) people in the room were lawyers, and it's also not clear that deniability has much utility outside of a formal legal setting.
If your adversary isn't constrained by some rule of law, they probably won't care at all whether there is a cryptographic proof or not that you wrote a particular message. (In retrospect, one possible exception is exposure in the media, but we did not discuss that scenario.)

Places of possible usefulness

So where might deniability come in handy during civil litigation or a criminal trial? Presumably the circumstance is that a piece of a chat log is offered as incriminating evidence, and the defendant is trying to deny something that they appear to have said in the log. This denial could take place in two rather different contexts: during rulings over admissibility of evidence, or (once admitted) in front of a jury.

In legal wrangling over admissibility, apparently a lot of horse-trading can go on -- each side concedes some things in exchange for the other side conceding other things. It appears that cryptographic proof of origin (that is, a lack of deniability) on the chat logs themselves might reduce the amount of leverage a defense lawyer can get from conceding or arguing strongly over that piece of evidence. For example, if the chain of custody of a chat transcript is fuzzy (i.e. the transcript could have been mishandled or modified somehow before reaching trial), then a cryptographic proof of origin would make it much harder for the defense to contest the chat transcript on the grounds of tampering. Deniability would give the defense more bargaining power.

In arguing about already-admitted evidence before a jury, deniability in this sense seems like a job for expert witnesses, who would need to convince the jury of their interpretation of the data.
There was a lot of skepticism in the room over this, both around the possibility of most jurors really understanding what OTR's claim of deniability actually means, and on jurors' ability to distinguish this argument from a bogus argument presented by an opposing expert witness who is willing to lie about the nature of the protocol (or who misunderstands it and passes on their misunderstanding to the jury).

The complexity of the tech systems involved in a data-heavy prosecution or civil litigation is itself an opportunity for lawyers to argue (and experts to weigh in) on the general reliability of these systems. Sifting through the quantities of data available and ensuring that the appropriate evidence is actually findable, relevant, and suitably preserved for the jury's inspection is a hard and complicated job, with room for error. OTR's deniability might be one more element in a multi-pronged attack on these data systems.

These are the most compelling arguments for the legal utility of deniability that I took away from the discussion. I confess that they don't seem particularly strong to me, though some level of "avoiding a weaker position when horse-trading" resonates with me. What about the arguments against its utility?

Limitations

The most basic argument against OTR's deniability is that courts don't care about cryptographic proof for digital evidence. People are convicted or lose civil cases based on unsigned electronic communications (e.g. normal e-mail, plain chat logs) all the time. OTR's deniability doesn't provide any legal cover stronger than trying to claim you didn't write a given e-mail that appears to have originated from your account. As someone who understands the forgeability of e-mail, I find this overall situation troubling, but it seems to be where we are.

Worse, OTR's deniability doesn't cover whether you had a conversation, just what you said in that conversation.
That is, Bob can still cryptographically prove to an adversary (or before a judge or jury) that he had a communication with someone controlling Alice's secret key (which is probably Alice); he just can't prove that Alice herself said any particular part of the conversation he produces.

Additionally, there are runtime tradeoffs depending on how the protocol manages to achieve these features. For example, forward secrecy itself requires an additional round trip or two when compared to authenticated, encrypted communications without forward secrecy (a "round trip" is a message from Alice to Bob followed by a message back from Bob to Alice). Getting proper deniability into the mpOTR spec might incur extra latency (imagine having to wait 60 seconds after everyone joins before starting a group chat, or a pause in the chat of 15 seconds when a new member joins) or extra computational power (meaning that it might not work well on slower/older devices) or an order of magnitude more bandwidth (meaning that chat might not work at all on a weak connection). There could also simply be complexity that makes it harder to correctly implement a protocol with deniability than an alternate protocol without deniability. Incorrectly-implemented software can put its users at risk. I don't know enough about the current state of mpOTR to know what the specific tradeoffs are for the deniability feature, but it's clear there will be some. Who decides whether the tradeoffs are worth the feature?

Other kinds of deniability

Further weakening the case for the legal utility of OTR's deniability, there seem to be other ways to get deniability in a legal context over a chat transcript. There are deniability arguments that can be made from outside the protocol.
For example, you can always claim someone else took control of your computer while you were asleep or using the bathroom or eating dinner, or you can claim that your computer had a virus that exported your secret key and it must have been used by someone else. If you're desperate enough to sacrifice your digital identity, you could arrange to have your secret key published, at which point anyone can make signed statements with it. Having forward secrecy makes it possible to expose your secret key without exposing the content of your past communications to any listener who happened to log them.

Conclusion

My takeaway from the discussion is that the legal utility of OTR's deniability is non-zero, but quite low; and that development energy focused on deniability is probably only justified if there are very few costs associated with it. Several folks pointed out that most communications-security tools are too complicated or inconvenient for normal people to use. If we have limited development energy to spend on securing instant messaging, usability and ubiquity would be a better focus than this form of deniability.

Secure chat systems that take too long to make, that are too complex, or that are too cumbersome are not going to be adopted. But this doesn't mean people won't chat at all -- they'll just use cleartext chat, or maybe they'll use supposedly "secure" protocols with even worse properties: for example, without proper end-to-end authentication (permitting spoofing or impersonation by the server operator or potentially by anyone else); with encryption that is reversible by the chatroom operator or flawed enough to be reversed by any listener with a powerful computer; without forward secrecy; and so on. As a demonstration of this, we heard some lawyers in the room admit to using Skype to talk with their clients even though they know it's not a safe communications channel because their clients' adversaries might have access to the Skype messaging system itself.
My conclusion from the meeting is that there are a few particular situations where deniability could be useful legally, but that overall, it is not where we as a community should be spending our development energy. Perhaps in some future world where all communications are already authenticated, encrypted, and forward-secret by default, we can look into improving our protocols to provide this characteristic, but for now, we really need to work on usability, popularization, and wide deployment.

Thanks

Many thanks to Nick Merrill for organizing the discussion, to Shayana Kadidal and Stanley Cohen for providing a wealth of legal insight and legal experience, to Tom Ritter for an excellent presentation of the technical details, and to everyone in the group who participated in the interesting and lively discussion.

Tags: chat, deniability, otr, security

30 November 2013

Russell Coker: Links November 2013

Shanley wrote an insightful article about microaggressions and management [1]. It's interesting to read that and think of past work experiences; even the best managers do it.

Bill Stone gave an inspiring TED talk about exploring huge caves, autonomous probes to explore underground lakes (which can be used on Europa) and building a refuelling station on the Moon [2].

Simon Lewis gave an interesting TED talk about consciousness and the technology needed to help him recover from injuries sustained in a serious car crash [3].

Paul Wayper wrote an interesting article about reforming the patent system [4]. He also notes that the patent system is claimed to be protecting the mythical "home inventor" when it's really about patent trolls (and ex-inventors who work for them). This is similar to the way that ex-musicians work for organisations that promote extreme copyright legislation.

Amanda Palmer gave an interesting TED talk about asking for donations/assistance, and the interactions between musicians and the audience [5]. Some parts of this are NSFW.

Hans Rakers wrote a useful post about how to solve a Dovecot problem with too many files open [6]. His solution was for a Red Hat based system; for Debian you can do the same but by editing /etc/init.d/dovecot. The use of the /proc/N/limits file was interesting, I've never had a cause to deliberately use that file before.

Krebs on Security has an interesting article about Android malware being used to defeat SMS systems to prevent bank fraud [7]. Apparently an infected PC will instruct the user to install an Android app to complete the process.

Rick Falkvinge wrote an interesting article about how to apply basic economics terminology to so-called "Intellectual Property" [8].

Matthew Garrett wrote an interesting post about the way that Ubuntu gets a better result than Debian and Fedora because it has clear fixed goals [9].
He states that many people regard Fedora as "a playground to produce a range of niche derivatives"; probably a large portion of the Fedora and Debian developers consider this a feature, not a bug.

Ming Thein wrote an interesting article about the demise of the DSLR [10].

Bruce Schneier wrote an interesting post on the detention of David Miranda by the British authorities [11]. It's mostly speculation as to why they would do such a thing (which seems to go against their own best interests) and whether the NSA even knows which documents Edward Snowden copied.

Jaclyn Friedman wrote an interesting article on Men's Rights Movements (MRAs) and how they are bad for MEN as well as for women [12].

Rodney S. Tucker wrote an insightful article for the IEEE about the NBN [13]. Basically the Liberal party are going to spend most of the tax money needed for a full NBN but get significantly less than the full benefit.

Lauren Drell wrote an interesting article for Mashable about TellSpec, a portable spectrometer that communicates with an Android phone to analyse food for allergens [14]. I guess this will stop schools from banning phones.

Katie McDonough wrote an interesting article for Salon about the Pope's statements about the problems with unchecked capitalism [15]. His ideas are really nothing new to anyone who has read the Bible and read the news. It seems to me that the most newsworthy part of this is that most Christian leaders don't make similar statements.

Daniel Leidert wrote an interesting post about power saving when running Debian on a HP Microserver [16]. Most of it is relevant to other AMD64 hardware too; I'll have to investigate the PCIE ASPM and spin down options on some of my systems that are mostly idle.

16 November 2013

Russ Allbery: Review: Blue Remembered Earth

Review: Blue Remembered Earth, by Alastair Reynolds
Series: Poseidon's Children #1
Publisher: Ace
Copyright: January 2012
Printing: June 2012
ISBN: 0-441-02071-2
Format: Hardcover
Pages: 505
Geoffrey Akinya is one heir to the vast Akinya business empire, which straddles the solar system and has created vast riches from exploration and mining. He wants nothing to do with it. His passion is elephants: long and methodical study of wild herds in Africa, including a slow and careful investigation into their thought processes. His obnoxious and superior cousins have a passion for business and finance and can have the running of the company for all he cares.

But Geoffrey's grandmother Eunice, the family matriarch and driving force behind much of their business expansion, has died. Her ashes are brought back to Africa from the Winter Palace in orbit around the Moon, where she lived out her final days, and scattered in a family ceremony. And, shortly thereafter, Geoffrey's cousins convince him to investigate a safe-deposit box left on the Moon by Eunice to ensure that it doesn't contain anything damaging to the family.

What it contains is the first step in a puzzle. Despite himself, Geoffrey and his sister Sunday (an artist and family black sheep who lives in a region of the Moon that is one of the sole holdouts against the ubiquitous monitoring common in the rest of the inhabited solar system) are slowly pulled into unraveling that puzzle. This leads them into an uneasy alliance with one of the major political forces in the solar system and some startling revelations about their grandmother's actual plans.

Blue Remembered Earth is another entry in the currently-popular sub-genre of solar system SF. Interstellar travel is still a dream, but a combination of improved technology, suspended animation, and heavy use of robots and robotic factories has let humans slowly expand into more of the solar system. On Earth, powers have risen and fallen, and the leap into the solar system coincided with a surging and powerful Africa.
Extensive undersea colonization has established a new international consortium of sea-floor countries, and Earth politics have realigned into a diplomatic and business struggle between surface and oceanic nations. Unusually for this sort of fiction, most characters of significance in this story are African. The United States and Europe have quietly disappeared in the way that all non-US countries tend to in most other SF. Reynolds doesn't do very much with this setup (and the characters don't seem distinctly African to me). It's just there as the unremarked normal. I liked that. I kind of wish more of African culture had come through in the story, but there are enough SF novels with an assumed and unremarked white (and usually US) future that an assumed and unremarked black African culture is a nice change of pace. I picked up this book in part because it was described as a better 2312. That's a fairly accurate summary (although only helpful to those who have read Kim Stanley Robinson's novel). The scope and setting are similar, although Reynolds's future cuts back on the scope of off-world human colonization, relies more heavily on robotics, and keeps the power base on Earth. Both books are structured as a mystery, although in Blue Remembered Earth that mystery is deliberately planted for the characters. Reynolds is nowhere near as good at set pieces as Robinson, sadly. There's nothing here at the level of some of the scenes on Mercury in 2312. But he is much better at plotting and characterization, and Blue Remembered Earth tells a coherent story with increasing tension and a satisfying ending. Reynolds here uses an interesting strategy for spinning a mystery out of pieces of knowledge and discovery: set the discovery process prior to the novel, use protagonists who were unaware of it, and turn the details into a mystery that they have to investigate. 
This is an interesting way to address the problem that most discovery processes are slow and not particularly dramatic. It is artificial (Eunice creates the plot for the novel), but that worked for me. By the end of the book, I think I understand her reasons for doing so, and both Geoffrey and Sunday have good reasons to not be aware of any of the details before they start following the clues that Eunice leaves behind.

The mystery of course sends Geoffrey and Sunday on a bit of a tour of the solar system as well as a tour of the past of their family business and the politics of the present. Here, Reynolds draws on an old SF idea: a conflict between groups that want to mechanize space exploration and groups that want to adapt humans for space. Long-time SF readers will be familiar with this approach, and I don't think Reynolds does anything startlingly new with it, but I thought it was a well-executed side plot. And, for once, an SF author realizes that Earth's oceans are as compelling a place for that conflict to play out as outer space.

There are, of course, significant mysteries to be uncovered, but it takes much of the book before those revelations start (even if one of them is almost spoiled in the dust jacket summary). The story up until that point is driven by character, particularly Geoffrey's stubborn annoyance at the rest of his family and his complete lack of interest in being the protagonist of a novel. I should probably warn here that many of the Amazon reviews disliked this book because of the characters, particularly Geoffrey. Apparently he comes across to some as whiny, resentful, and annoying. I, however, didn't have this reaction at all, possibly in part because I wholeheartedly share his opinion of both big business and his cousins.
I thoroughly enjoyed reading an SF novel whose main characters are not driven by either typical capitalist motives or a desire for personal exploration, but instead are pulled into the plot through a sense of loyalty and an inability to get out of the way of the problem. I also liked the way Reynolds manages to broaden and deepen the reader's appreciation for the characters as the story unfolds, and even salvages some characters I thought were unsalvageable. The last third of the book features some excellent plot twists and some very nicely-handled revelations. This is apparently the first book of a trilogy, and I had a suspicion at the end of the book that sequels were coming, but I think it stands on its own quite well. Most of the questions raised by the plot are resolved by the end of the book, and those that remain are of a size that other books have left unanswered intentionally. Despite not reaching the same moments of awe-inspiring beauty, and despite more workman-like descriptions, the stronger plot and more interesting characters do make Blue Remembered Earth a better 2312 for me. Given that 2312 won a Nebula, that's saying something (although I don't agree with that award). It's a bit deliberate in its pacing, and the plot built around a constructed puzzle can feel a bit artificial, but I think it's a solid and satisfying near-space SF novel. Followed by On the Steel Breeze. Rating: 7 out of 10

10 October 2013

Russ Allbery: Review: 2312

Review: 2312, by Kim Stanley Robinson
Publisher: Orbit
Copyright: 2012
ISBN: 0-316-19280-5
Format: Kindle
Pages: 563
Swan Er Hong is a native of Mercury, granddaughter of one of the most influential women in the solar system. Alex, known as the Lion of Mercury, has just died as the story opens, and Swan is mourning. But she's also discovering that something else has been going on, something that Alex had not confided in her. Something that involves an inspector from the asteroids, a scientist from Titan, and a message Alex left in the event of her death, asking Swan to convey something personally to another scientist on Io. And then Mercury is attacked and the situation becomes much more complicated. 2312 is, like most Robinson novels, built on top of monumental world-building. The date is given away by the title; the setting is a solar system full of terraforming, habitats, and explosive human colonization. There are human settlements on every world that could possibly sustain them, including a terraformed Mars and a vast in-progress terraforming project on Venus. Mercury, Swan's home, supports the city known as Terminator, a shielded habitat that travels around the planet on rails, pushed forward by the thermal expansion of metal at sunrise. Humans have scattered across the solar system and changed everything they've touched, even the asteroids. Many of those have been repurposed as long-haul shuttles: self-contained habitats on carefully-designed routes on which people live for months or years until they get to their destinations. Parts of this book are gorgeous. One of Robinson's strengths is creating images of the vastness of space, science, and human endeavor that feel plausible and inspiring at the same time. The setting of 2312 frees him to go wild with mega construction projects, and he takes full advantage. Parts of the book inspire the same sense of awe as the Hoover Dam or the Great Wall of China, transplanted into space and backed with neat scientific ideas to ponder. 
(Even if the more skeptical will note certain human habitation hazards, such as prolonged weightlessness and radiation, are mostly hand-waved away.) Unusually for a Robinson novel, 2312 also features an interesting and vibrant protagonist. Usually I can barely remember Robinson's protagonists five minutes after I finish the book, but Swan is so impulsive, erratic, and emotionally intense that I doubt I'll forget her. Her past is stuffed with outrageous risks and intense creativity: having songbird neurons transplanted into her brain, ingesting extraterrestrial bacteria, designing habitats in the asteroid belt, and creating art on the plains of Mercury. As the story gets underway, she meets the oversized and mellow Wahram; at first, she finds him boring, but then they're thrown together by one of the best set pieces of the book, and both they and the reader discover they work well together. Wahram serves as a straight man and as a stand-in for the reader's opinion, showing Swan all the more clearly in contrast. All this sounds quite promising. Unfortunately, what makes 2312 an ambitious failure rather than an excellent story is the story part. Robinson has built an amazing world, but he doesn't seem to know what to do with it. His economy, at least outside of Earth (and more on that in a moment), is essentially post-scarcity. Few of the characters we meet seem to have any concept of fiscal or practical limitation. They head across the solar system on a moment's notice, take vacations in habitats, or head to favorite spots for the view. It felt like Robinson was trying to write a Culture story. Unlike Banks, however, he struggles with generating a plot. 2312 is, in structure, a murder mystery. The characters witness a crime, survive it, and then intend to investigate it, and the crime appears to grow and link to fault lines in their society. 
This could have worked, but it has two problems: their society is only barely coherent enough for the reader to perceive the potential fault lines, and Robinson seems to have no idea how to construct a mystery plot. There is no sense of coherence to Swan and Wahram's approach, no sense of a gathering of clues, and very little sense of drama. All of the plot revelations are either dropped in Swan's lap by another character at a convenient moment or generated by Swan refusing to take the plot of the book seriously. The characters do essentially no meaningful investigation on-camera and show almost no investment in the outcome of the plot. When the climax comes, it's all weirdly forgettable. There are also large sections of the book that appear to have nothing to do with the rest of the plot, and that's where the unfortunate interval on Earth comes in. Robinson takes advantage of the scenes on Earth to do a bit of alienation and shows how foreign and strange and stifling Earth feels to someone who grew up outside of its atmosphere. Parts of this work, but he puts the plot on hold to do it. And parts of it do not work at all. The most glaring example is Swan and Wahram's bizarre bit of attempted charity in Africa, which comes across as stunningly high-handed and arrogant. This could be in character, particularly for Swan (who is not long on empathy), but, if so, the book doesn't signal to the reader that it should be read that way. Instead, there are some side (or snide) comments that seem to indicate Robinson knows nothing about the economic arc of Africa from the past twenty years. And when their absurd, botched, condescending charity plan fails for all the obvious reasons, the characters, and apparently the novel, throw up their hands and write Earth off as a stagnant lost cause that can't accept the imposition of a good idea and go back to the plot, never apparently caring about Earth again. 
Almost as frustrating is the way that these interludes are tied back into the story, which is usually through Swan getting ridiculously lucky on her random encounter rolls. It felt like whenever Robinson needed to make progress in the plot, Swan would just accidentally run into exactly the right person or situation to bring up the next plot point or to have some investigation make sense. (Not that Swan usually figured this out. Normally, the inspector explains it to her.) The author's finger was planted so firmly on the scales that it destroyed my suspension of disbelief and made a mockery of the idea that the characters were actually investigating anything. 2312 is built around a skeleton of a plot, but the lack of engagement with it, the lack of tension and emotion, the way the next developments are generally narrated to the protagonists and the reader, and the repeated use of random encounters to steer it left me without much reason to care. Robinson tries a few twists, but since the story never felt committed to its plot anyway, those twists feel less like planned complications and more like another random veer in the road. It didn't help that the final outcome was more prosaic and forgettable than the book had been implying it would be. There is a lot to like here. I'm very pleased to see Robinson finally write a memorable protagonist, and he's very good at both world-building and set pieces. But it needed a plot, and it needed a more coherent and complete cultural backdrop for its characters. Without both, it fell back into the typical Robinson trap: gorgeous moments separated by a whole lot of boring, and an overall impression of a construction tour rather than a story. There are bits of it I loved and still remember, particularly Swan and Wahram in the tunnels of Mercury (which is possibly the best extended characterization Robinson has ever written). But the book as a whole is a mess, and I can't recommend wading through it for the good parts. 
Rating: 5 out of 10

15 August 2013

Daniel Leidert: HP N54L Microserver - energy efficiency and power management

I recently worked on activating power management functions and reducing the energy consumption and noise of my little HP N54L "toy". During this process I tried to avoid the usage of /etc/rc.local and set things by udev, hdparm and friends. Below are my results.

Actual results
With the following steps my system (N54L + 3x WD20EFRX HDD + 1x WD5003AZEX HDD + LCD-mod + case fan mod + Debian Wheezy) uses 27W in idle mode. The USB W-LAN card uses another 10W. In active mode, e.g. compiling source code, the system runs (and boots) with around 57W. The highest power consumption observed is during the startup phase with 88W.

First things first
For the following steps it might be necessary to have some packages installed that are not mentioned in this post. If I missed something, I appreciate a hint. Further, the following steps might produce even better results with a custom kernel. I'm using the stock linux-image-3.2.0-4-amd64 kernel image at the time of writing and I have these packages installed: amd64-microcode, firmware-linux, firmware-linux-free, firmware-linux-nonfree and firmware-atheros (the latter for my WLAN card).

ASPM and ACPI
First I enabled PCIE ASPM in my (non-modded) BIOS and forced it together with ACPI via grub by changing GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub so it looks like this:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash acpi=force pcie_aspm=force nmi_watchdog=0"
ASPM has now been enabled as lspci proves:

00:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780/RS880 PCI to PCI bridge (PCIE port 0) (prog-if 00 [Normal decode])
[..]
LnkCap: Port #1, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <64ns, L1 <1us
ClockPM- Surprise- LLActRep+ BwNot+
LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
[..]
00:06.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] RS780 PCI to PCI bridge (PCIE port 2) (prog-if 00 [Normal decode])
[..]
LnkCap: Port #3, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <64ns, L1 <1us
ClockPM- Surprise- LLActRep+ BwNot+
LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
02:00.0 USB controller: Renesas Technology Corp. uPD720201 USB 3.0 Host Controller (rev 03) (prog-if 30 [XHCI])
[..]
LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s L1, Latency L0 <4us, L1 unlimited
ClockPM+ Surprise- LLActRep- BwNot-
LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
[..]
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5723 Gigabit Ethernet PCIe (rev 10)
[..]
LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 <1us, L1 <64us
ClockPM+ Surprise- LLActRep- BwNot-
LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
[..]
Even so /sys/module/pcie_aspm/parameters/policy will still show as below:

[default] performance powersave
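The bracketed entry in that file is the policy currently in effect. As a minimal sketch (the function name is my own, and it assumes the file always uses the space-separated, one-entry-bracketed layout shown above), the active policy can be extracted like this:

```python
from pathlib import Path

# Path quoted in the post; only read when the caller asks for it.
ASPM_POLICY_FILE = Path("/sys/module/pcie_aspm/parameters/policy")


def active_aspm_policy(text):
    """Return the entry wrapped in [brackets], i.e. the active policy.

    `text` is the content of the policy file, for example
    "[default] performance powersave".
    """
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError("no active policy marked in %r" % text)


def read_active_aspm_policy(path=ASPM_POLICY_FILE):
    """Read the sysfs file and return the active policy name."""
    return active_aspm_policy(Path(path).read_text())
```

On the system described here, `read_active_aspm_policy()` would be expected to return `default` at this stage of the setup.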
I'll show how to set the powersave value in /sys/module/pcie_aspm/parameters/policy in the next section.

JFTR: These are my ACPI related packages installed: acpi, acpid, acpi-support and acpi-support-base.

Enable powersaving via UDEV
The following rules file /etc/udev/rules.d/90-local-n54l.rules has been inspired by a blog post. It enables powersaving modes for all PCI, SCSI and USB devices and ASPM. Further, the internal RADEON card's power profile is set to the low value, since there is usually no monitor connected. The file contains these rules:

SUBSYSTEM=="module", KERNEL=="pcie_aspm", ACTION=="add", TEST=="parameters/policy", ATTR{parameters/policy}="powersave"

SUBSYSTEM=="i2c", ACTION=="add", TEST=="power/control", ATTR{power/control}="auto"
SUBSYSTEM=="pci", ACTION=="add", TEST=="power/control", ATTR{power/control}="auto"
SUBSYSTEM=="usb", ACTION=="add", TEST=="power/control", ATTR{power/control}="auto"
SUBSYSTEM=="usb", ACTION=="add", TEST=="power/autosuspend", ATTR{power/autosuspend}="2"
SUBSYSTEM=="scsi", ACTION=="add", TEST=="power/control", ATTR{power/control}="auto"
SUBSYSTEM=="spi", ACTION=="add", TEST=="power/control", ATTR{power/control}="auto"

SUBSYSTEM=="drm", KERNEL=="card*", ACTION=="add", DRIVERS=="radeon", TEST=="power/control", TEST=="device/power_method", ATTR{device/power_method}="profile", ATTR{device/power_profile}="low"

SUBSYSTEM=="scsi_host", KERNEL=="host*", ACTION=="add", TEST=="link_power_management_policy", ATTR{link_power_management_policy}="min_power"
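
To check whether rules like these took effect after a reboot, one can walk a sysfs bus directory and list devices whose power/control attribute is not auto. This is only a sketch under assumptions: the helper name is hypothetical, and it expects the usual layout in which each device directory contains a power/control file.

```python
from pathlib import Path


def devices_not_on_auto(bus_dir):
    """Return (device, value) pairs whose power/control is not 'auto'.

    `bus_dir` is a directory such as /sys/bus/pci/devices; devices
    that have no power/control attribute are skipped.
    """
    result = []
    for dev in sorted(Path(bus_dir).iterdir()):
        control = dev / "power" / "control"
        if control.is_file():
            value = control.read_text().strip()
            if value != "auto":
                result.append((dev.name, value))
    return result
```

Calling `devices_not_on_auto("/sys/bus/pci/devices")` (or the same for `/sys/bus/usb/devices`) should return an empty list if every matching device picked up the rule.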

Set hard drive spindown timeout
I decided to send my system drive to standby after 20 minutes and the RAID drives after 15 minutes. This is usually ok, because the RAID isn't always used. hdparm is the right tool for this. Many people use the /dev/disk/by-uuid/... syntax here, to avoid having to touch the configuration file if some system configuration changes. Because I'm running a RAID, I couldn't use this syntax, although it might be possible to use /dev/disk/by-id/... instead. Well, for the moment I stay with the configuration below. The relevant part of /etc/hdparm.conf is:

[..]

# system harddrive
/dev/sda {
	spindown_time = 240
}

# below are the WD20EFRX drives
/dev/sdb {
	spindown_time = 180
}

/dev/sdc {
	spindown_time = 180
}

/dev/sdd {
	spindown_time = 180
}
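
The values 240 and 180 are not minutes. Assuming spindown_time uses the same encoding as hdparm's -S option (as described in hdparm(8): 1 to 240 are multiples of 5 seconds, 241 to 251 are units of 30 minutes), the following sketch converts a value to seconds. The function name is made up for illustration, and the special values 252 to 255 are deliberately not handled.

```python
def spindown_seconds(value):
    """Translate an hdparm -S style value into a timeout in seconds.

    Returns None for 0 (timeout disabled). Only the ranges 0..251
    are covered; the special values above that raise ValueError.
    """
    if value == 0:
        return None
    if 1 <= value <= 240:
        return value * 5                    # multiples of 5 seconds
    if 241 <= value <= 251:
        return (value - 240) * 30 * 60      # units of 30 minutes
    raise ValueError("unsupported spindown value: %d" % value)


# The two values used in the configuration above:
print(spindown_seconds(240) // 60, "minutes")  # system drive
print(spindown_seconds(180) // 60, "minutes")  # RAID drives
```

This matches the intent stated above: 240 × 5 s = 20 minutes for the system drive and 180 × 5 s = 15 minutes for the RAID drives.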

Idle mode
When there is nothing to do for the system, all I hear is the (still a bit noisy) fan of the power supply, which I might replace in the future too, either by testing a different fan or by replacing the whole power supply unit with the fanless FORTRON FSP150-50TNF or (even better) a picoPSU.

The system currently shows a power consumption of around 37W in idle mode, of which the USB W-LAN card itself needs around 10W. There is a possibility to enable power saving mode for this card too. I could add this entry to /etc/udev/rules.d/90-local-n54l.rules:

SUBSYSTEM=="net", ACTION=="add", KERNEL=="wlan*", RUN+="/usr/bin/iw dev %k set power_save on"
But it turned out that the connection became a bit unstable after it. So I don't use this rule.

More on the road
There are a lot more options one can easily find via $search_engine. The N54L system could be put to sleep and woken up via Wake-on-LAN (WOL). This is a feature I don't use. I've also read rumors about enabling different sleep/suspend states of the system, which seems to require installing a modded BIOS. Well, I'll post news and changes if they happen to come ;)

28 May 2013

Russ Allbery: Collected haul

I've been slow lately in writing these up (and, for that matter, in doing most other things related to reading; things have been rather busy lately). This is a bunch of here-and-there purchases over the last few months, including Powell's Indiespensable shipments.

Sandra Barret Face of the Enemy (sff)
Anne Bishop Written in Red (sff)
Lois McMaster Bujold Captain Vorpatril's Alliance (sff)
Cary Caffrey The Girls from Alcyone (sff)
Jenni Fagan The Panopticon (mainstream)
Niels Ferguson, et al. Cryptography Engineering (non-fiction)
Jen Kirchner The Fourth Channel (sff)
Anthony Marra A Constellation of Vital Phenomena (mainstream)
Steve McConnell Code Complete (non-fiction)
Seanan McGuire Velveteen vs. The Junior Super-Patriots (sff)
Patrick Nielsen Hayden, et al. (ed.) Some of the Best from Tor.com: 2012 (sff anthology)
Lisa O'Donnell The Death of Bees (mainstream)
Susan Palwick Flying in Place (sff)
Kim Stanley Robinson 2312 (sff)
John Scalzi Redshirts (sff)
Ian Tregillis Bitter Seeds (sff)
Leon Trotsky The History of the Russian Revolution (non-fiction)
Simon Van Booy The Illusion of Separateness (mainstream)
Chris Anne Wolfe Shadows of Aggar (sff)
Barbara Ann Wright The Pyramid Waltz (sff)

That's a lot of stuff. It includes a couple of non-fiction O'Reilly books from sales, a few months of Powell's Indiespensable subscriptions, a variety of books I picked up after a discussion of good lesbian fiction on Tor.com (romance without the obnoxious gender tropes, or at least without as many of them), and the rest of the Hugo nominees for the year. I got a ton of reading done earlier this month. I wish I could say the same thing about reviews, but I only wrote a few. That's something that I want to try to catch up on soon, so there will probably be a flurry of those posted soon. I've already read Blackout and Redshirts of this year's nominees (a review of the latter is coming), so at least I'm not too far behind on the reading. Throne of the Crescent Moon is in progress now.

4 January 2013

Jan Wagner: New blogging engine

You may have noticed that I recently started posting more updates again. The reason is that I switched over from WordPress to Octopress as my blogging engine. The switch was driven by the fact that K2, the theme I was using, got stuck, and with the upcoming release of Debian wheezy I'm forced to switch to a more recent WordPress release, which is likely incompatible with it.
Another reason is that I got bored by WordPress itself (and its software dependencies). With Octopress these dependencies are reduced to a webserver which can serve static files and rsync on the server side. Maybe I will post some parts of the story later: what I did when migrating the content and what components (plugins, theme ...) I'm using.

24 November 2012

Russ Allbery: Review: Dune

Review: Dune, by Frank Herbert
Series: Dune #1
Publisher: Ace
Copyright: 1965
Printing: September 1990
ISBN: 0-441-17271-7
Format: Mass market
Pages: 537
The Atreides family, distant cousins to the imperial family, have ruled the planet of Caladan for twenty generations. Caladan is a wet farming world, comfortable and pleasant, but not horribly important. But House Atreides is feuding with House Harkonnen, and, at the start of Dune, that feud maneuvers Duke Leto into giving up his holdings and moving his family to take possession of Arrakis. Arrakis is a desert planet, previously controlled by Baron Harkonnen. It is unrelentingly hostile, home to smugglers and dangerous local desert dwellers called Fremen. But it's also one of the most important planets in the galaxy, since it's the sole origin of the chemical called melange, or spice. Spice permits a limited form of prescience, which allows the navigators of the Spacing Guild to successfully steer ships across the interstellar void. Arrakis's production of spice is what makes interstellar travel, and therefore all of interstellar civilization, possible. Dune is the story of Paul Atreides, son and heir to Duke Leto Atreides. His mother, Lady Jessica, is one of the Bene Gesserit, a secretive order of women devoted to mental and physical discipline and to the long-term genetic improvement of mankind. He is not supposed to exist; Lady Jessica was supposed to only bear a daughter of Leto. But he may be something special, the long-sought (but also dangerous) Kwisatz Haderach who can unite male and female Bene Gesserit powers. The Bene Gesserit take great interest in him from the start of Dune. More surprisingly, so do the Fremen of Arrakis; from the moment he arrives there, he seems to be fulfilling prophecies of theirs that are partly, but not entirely, ones planted by the Bene Gesserit long ago. 
The feud with the Harkonnens, the unstable place of Arrakis in galactic politics, the dreams of the Fremen and the Imperial ecologist on Arrakis of terraforming, Bene Gesserit plans, Paul's abilities, and the legends of the Fremen all combine in a complex mix of politics, battle, and clashes of culture. Dune is an acknowledged SF masterpiece, one of the best-known classics of the genre. It's usually found in short lists of the best SF novels ever written. It spawned five sequels by Frank Herbert (about which more in a moment), as well as numerous additional sequels and prequels by Kevin J. Anderson and Brian Herbert. It's been adapted for the screen twice, not to mention board games, video games, and numerous other projects. This is my second reading, the first in about twenty years, but the story was still immediately familiar from having seen films and having discussed and read about the universe. This is not science fiction in any strict sense. Dune is science fiction in the same way that Star Wars is: a futuristic gloss on top of power structures inspired by feudalism, heavily mixed with mysticism, mental powers, magic, and implausible but convenient science that creates the story effects the author wants. Both Bene Gesserit powers in general and Paul's abilities in particular are effectively magic. There is some hand-waving explanation of their ability to verbally control other people as taking advantage of specific pitches and intonations that people are vulnerable to, but it's effectively spell-casting (and is a direct inspiration for Jedi mind tricks). All of the mysticism (and there's quite a lot of it in Dune, including race memory, precognition, and even molecular transformation) resembles the Force from Star Wars more than anything scientific. Dune is epic fantasy told on a science fiction stage, complete with a young protagonist coming into his powers and dangerous and sometimes hostile mentors. 
What Dune gets right, and what has put it so high in the pantheon of great science fiction, is the world building. Herbert sets the story tens of thousands of years into the future of humanity and then effectively projects the feeling of deep history over everything in the novel. This is the kind of book that has appendices with more background information; more to the point, it's the kind of book where you may actually read them out of curiosity. Mankind has a vast interstellar empire (Herbert's universe, like Asimov's Foundation universe, admits no aliens) governed by a system akin to the early British monarchy. An emperor rules in balance with the Great Houses, who meet in a sort of parliament. But against both is a third force: the Spacing Guild, who maintains a monopoly over all interstellar travel. (And the Bene Gesserit form an underground, secretive fourth power base.) Herbert plays with vast swaths of time and great forces of history as well as very good epic fantasy and better than nearly all SF I've read. The detailed world-building is equally good. Nearly all of Dune takes place on the desert planet of Arrakis, which has a lovingly-described ecology and local culture built entirely around scarcity of water. (The details of that ecology are much of the plot and mystery of the book, so I won't spoil them further.) While I doubt the precise details hold up to close scientific scrutiny, this is an obvious precursor to the great ecological stories of later SF, such as Kim Stanley Robinson's Mars trilogy. The details all feel right and hang together in satisfying ways, while also generating the great Sand Worms of Arrakis, a key ingredient in several of the best set pieces in the history of SF. This is the sort of book where the fascinating details and discoveries about the world do as much to keep one turning the pages as the plot, although the plot is also satisfyingly twisty and tense. Unfortunately, Dune doesn't get everything right. 
The amount of mysticism involved is a bit much, and at times the drug-trip mystical experiences of viewpoint characters turn into excessively purple prose and nearly incomprehensible descriptions. Those mystical experiences also involve race and genetic memory, a concept that's just scientific enough to be unbelievable. A few of the other scientific cheats are also rather blatant; for example, Herbert constructs an elaborate, artificial technology of shielding that seems designed primarily as an excuse to add sword combat to a futuristic story, and I have always struggled to suspend disbelief about the way lasers and shields interact in Dune. The Spacing Guild's monopoly on interstellar travel can be explained; their monopoly on local orbital space, or even the high stratosphere, both vital to allow certain things on Arrakis to remain secret, are much more dubious. Herbert mostly doesn't try to explain these things, and as with Star Wars the less explained the cheats are, the better they work as part of the story. But the technological background doesn't hold up against much examination. Worse, for me, is the general quality of the writing. Herbert does some things very well, such as world-building, and avoids awkward infodumps. Characterization and pacing are both fairly solid; he does a good job with Paul and Jessica in particular, and I've always liked the Fremen. But he wants to put the reader in everyone's head, frequently by giving character thoughts as italicized dialogue, and to enable that he uses a perspective that I always find distracting. Most fiction is written in tight third person. This means that the viewpoint character for any given section of the book is referred to in the third person, like all the other characters, but the reader has special access to their thoughts and emotions. 
We get to know what they're really thinking and feeling, not just the impressions they give to others, while the non-viewpoint characters are shown only from external appearances and the thoughts of the viewpoint character. Some books hold to the same viewpoint character throughout, but more commonly books move between viewpoint characters at scene breaks to provide more angles on the book's events. First person, in which the story is told by a specific character as if they were telling a story or writing it down, is the most common alternative. Third person objective, in which we don't get any special insight into the internal thoughts of any of the characters, is less common but still unsurprising. Dune does not use any of those perspectives. Instead, Dune uses wandering third-person omniscient, in which we get the inner thoughts and emotions of a character in a scene and then a few lines later the inner thoughts and emotions of a different character. This is the sort of thing that may or may not bug you depending on how much you've read, how deep the expectations of perspective are ingrained, and how much you notice perspective. It drives me nuts. I subconsciously align with the viewpoint character of a section, and pay attention to the ways that authors indicate which character will be the viewpoint character at the start of a scene. Herbert's constant flitting from character to character makes me dizzy. We get the verbatim thoughts of everyone almost indiscriminately, making me feel like I'm randomly hopscotching through the scene. For me, this does two things: it hurts my ability to get engrossed in the story, since I'm constantly thrown out of my normal reading mode when the viewpoint unexpectedly shifts, and it makes the writing feel repetitive. 
One keeps hearing about the same thing from multiple perspectives, and at times the story bogs down in everyone's internal dialogues rather than showing character reactions and letting the reader draw their own conclusions. I think it tries for a cinematic perspective, but ends up making the story feel muddled. The other flaw, which I didn't notice originally but which leaped out at me during this re-read, is that Herbert's world-building uses quite a few stereotypes. The most notorious, and most widely discussed, is of course the Fremen. Herbert draws heavily on Arab and Islamic culture even beyond the obvious similarities of people living in a harsh, arid climate. He borrows some rather loaded terms and cultural markers, such as jihad, to construct a culture of potential religious fanatics. This is not all bad; the Fremen are clearly portrayed as the good guys, which is a refreshing change from more typical current portrayals of Islam. But it becomes clear that they have aligned their entire culture around influences from outside, and the whole plot of Dune can be fairly characterized as an instance of "what these people need is a white man." Paul (and Kynes before him) joins their culture as well, but Paul becomes a better native than the natives, while simultaneously bringing his outside perspective. It's the sort of plot that is more widely noticed today than it would have been in 1965. Another major example of this, and one that I found more blatant, is that Herbert turns the Harkonnen into hissable, one-sided villains and uses some nasty stereotypes to do it. The insane torturer is consistently and repeatedly described as effeminate, fat is used as a marker of moral inferiority and evil, and the primary villain is homosexual and prefers drugged young male slaves. Here too, this sort of characterization short-cut was more common in 1965, but it's not appealing and makes the (already rather camp) scenes set among the Harkonnen even less enjoyable. 
Less clear-cut is the way women are handled throughout Dune. I do have to give Herbert some credit, particularly for the era in which he was writing. There are powerful female characters in Dune, including both Jessica and Alia, who have their own independent power and successfully pursue their own agendas throughout. The effectively all-female Bene Gesserit is a major political power in the story and is treated by the other players with respect as well as fear. But it's hard not to also notice the general position of women as subservient to men, not only in the general culture of the Great Houses but also in the more positively-portrayed Fremen culture. Indeed, the subservience of women is even worse in Fremen culture, where they're treated like property and where being killed by a woman is a sign of shame. Again, Herbert deserves some credit for doing better than a lot of 1960s fiction, but the sexism fairy has still been at work here. None of these flaws change the fact that Dune is a masterpiece. Herbert brings together history, world building, ecology, politics, and a compelling coming-of-age story about a messiah figure into a fast-paced, sweeping epic with a thoroughly satisfying conclusion. I think they do make it a flawed masterpiece, but it's still one of those SF novels that everyone should read at least once. Sadly, it's also a masterpiece that I think has suffered from its own success in the form of sequels, prequels, and a ton of supporting material. This is one of the problems that truly excellent world building can lead to. Human history is fractal: any specific detail can be examined in more depth and will usually lead (provided that information is available at all) to even more fascinating detail. The best world building conveys that impression of depth. That's what Herbert achieves here with hints, notes, and asides: the sense that galactic history is a vast edifice with the same fractal complexity as real human history. 
It makes for a compelling background, but it also inspires people to dig into that background and flesh out all of the details the way that we do with human history. But this doesn't actually work; invented history created by one person simply cannot be fractal in the same way. Human history is endlessly complex because it was generated by the complex interactions of many people. Invented history is an illusion that hints at complexity by building the same surface, but one mind, or even a small number of minds, cannot generate the same depth. The result is that if one digs too deep, one removes that convincing surface and ends up with a mundane, simplistic, and unsatisfyingly fake set of events. I think that's what's happened with all of the supporting material that's been written around Dune since its original publication. Dune is of a piece, a single story that's deeply enjoyable on its own terms and leaves the reader with a satisfying impression of complexity. The systematic excavation of that complexity lessens it and reveals too much of the illusion. Yes, I want to know more about the Butlerian Jihad, but that's the point: the wanting is the sign of successful crafting of imagined history. Reading the definitive account is more likely to leave me unsatisfied than to lead to the recursive curiosity that human history can create. The sequels to Dune written by Herbert himself are, for me, another matter. Some reviewers level the same criticism at them: that Herbert dives too far into background best left unexplored. But they have the advantage of moving forward, telling more of the story set off by Paul, and the end of Dune is a clear setup for a sequel. One of Paul's goals throughout most of the book has been left unaccomplished. 
I don't think Herbert dove too deep into his creation; rather, my problem with his sequels (all of which I've read, although it's been some years now) is that he took the story in a direction that I actively disliked and found painful to read. Regardless, the general consensus is that the sequels aren't as good as the original, and while Dune doesn't fully resolve its story, it's complete enough that it's possible to stop here. Stopping is the general recommendation, although I still may re-read and review the sequels at some point. Followed by Dune Messiah. Rating: 8 out of 10

27 May 2012

Ingo Juergensmann: DaviCal and Addressbook Sync

After exchanging my rusted Nokia N97 for an iPhone I needed to set up calendar and addressbook syncing again. Addressbook syncing wasn't possible with the N97 anyway, or I never found out how to do it. Previously I synced my N97 by using iSync, but iSync doesn't sync with the iPhone anymore, although the iPhone now syncs with iTunes. Weird? Yes. But that's how it works. The iPhone now syncs via WLAN instead of Bluetooth, which is an improvement, but I don't really want to fire up iTunes every time I want to sync my calendar or addressbook. And using iCloud is not an option either, because of privacy concerns. I'm a big fan of selfhosting and already have a DaviCal instance running on my server. DaviCal is a great piece of software from Debian maintainer Andrew McMillan, who is doing a survey on DaviCal, so there is, of course, a Debian package for it. Anyway, one problem with OSX and addressbook sync via CardDAV is that it does not work out of the box with Addressbook.app on OSX, although the documentation in the DaviCal wiki is quite useful. When you try to enter a new account in Addressbook.app the sync will not work. The solution can be found on the private blog of Harald Nikolisin, which is in German. He writes (German, English translation follows):
Mac OS X Adressbuch anschliessen
Oh ja wenn man mittels SSL drauzugreift, dann gibts Probleme.
Im der Applikation Adressbuch kann man zwar ein CardDAV Account anlegen bei dem man die Authorisierungsdaten und den kompletten Serverpfad (s.o.) eingeben kann, man läuft aber immer auf eine Fehlermeldung hinaus.
Die Lösung ist, zweimal Create anzuklicken um den fehlerhaften Account anzulegen. Dann editiert man manuell folgende Datei: ~/Library/Application Support/AddressBook/Sources/UNIQUE-ID/Configuration.plist
Dort trägt man unter Server String die komplette URL ein.
https://SERVERNAME/davical/caldav.php/USERNAME/contacts
Am besten modifiziert man noch das Feld HaveWriteAccess auf den Wert auf 1
English translation: 
Connecting Mac OS X addressbook
Oh, yes - there are problems when accessing via SSL.
In Addressbook.app you can add a CardDAV account where you can define authentication and 
server path, but you'll always get an error message.
The solution is to click twice on "Create" in order to create the faulty entry.
Then you can edit the following file:
~/Library/Application Support/AddressBook/Sources/UNIQUE-ID/Configuration.plist
There you enter your complete URL under Server String.
https://SERVERNAME/davical/caldav.php/USERNAME/contacts 
It's best to modify the field HaveWriteAccess to the value "1"
After following this advice my Addressbook.app successfully stored the contacts in DaviCal's CardDAV, from where I can sync with my iPhone. Maybe Andrew wants to include this in the DaviCal wiki, or maybe I'll do it myself by registering in the wiki for that purpose... Oh, and I forgot: using the Roundcube plugin from graviox works nicely with DaviCal's CardDAV as well!
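The manual edit of Configuration.plist described above could also be scripted. The following is only a sketch using Python's standard plistlib module. The function name patch_carddav_config is made up for illustration, and the name of the key that holds the server URL is an assumption: the post only says the URL goes "under Server String", so the key is passed in by the caller, who should check the file for the key it really uses. Only HaveWriteAccess is a name taken from the text.

```python
import plistlib
from pathlib import Path


def patch_carddav_config(plist_path, server_key, url, write_key="HaveWriteAccess"):
    """Set the server URL and the write-access flag in a Configuration.plist.

    server_key is NOT known from the post: look it up in your own file.
    write_key defaults to the field name mentioned in the post.
    """
    path = Path(plist_path)
    # plistlib.load detects XML and binary property lists on its own
    with path.open("rb") as fd:
        config = plistlib.load(fd)
    config[server_key] = url
    config[write_key] = 1
    # plistlib.dump writes the XML format by default
    with path.open("wb") as fd:
        plistlib.dump(config, fd)
    return config
```

It would be called with the path to the Configuration.plist of the faulty account, the key found in that file, and the complete URL in the form shown above. Taking a backup copy of the file first is probably wise.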

5 November 2010

Vincent Sanders: Keeping kindling dry

I, along with a great number of people I know, now possess a 3rd generation Kindle. It seems Amazon have found a feature set and price point which makes this device a winning solution.

My bookshelf complete with covered kindle
I did look at a huge number of alternatives like the Sony PRS600 and others but they were all more expensive than the £110 for the Kindle and did not have enough features to make a compelling argument for spending more.

Yes it has DRM. Yes it "only" supports PDF, MOBI and mp3. Yes it will not win any style or usability awards. But I went into this with my eyes open: the device is "good enough".

The device lets me read books from a reasonable display. The integration with amazon.com is so seamless it poses a serious danger to my bank account. I should expand on that last point :-) Amazon have got the whole spending money for a book thing executed so well that you do not think twice about a couple of pounds here and there, this soon adds up. I have set myself a rigid budget.

My main complaints are really just niggles:

  • Another different USB connector! Wahh, I thought everyone had agreed on mini USB? seems that I now have to have yet another lead for micro USB

  • The commercial book selection is a bit limited and missing a surprising number of popular titles. Some of this appears to be the publishers and authors simply clinging to their old business model. I fear some of them might not survive, and early indications are they are behaving like the music industry did... Guys, you are selling an infinite good; a scarcity model is going to fail!

  • The price of some of the books is absurd... they are asking hardback prices for the electronic edition! Seriously? How on earth can that possibly be justified? I can see that a hardback book with its print run could cost £5 per physical item (going from hulu print on demand prices as a worst case) plus shipping and stocking fees. So how can you possibly justify charging the same price for a pile of bits where none of that applies? Also the pile of bits cannot be lent or sold, not impressed.

  • eBook formatting is generally dreadful. I do not know who is mastering these books but they need to do a better job. If they tried to pull this in the physical editions they would get a seriously large number of returns.

  • I still have to pay for whispernet delivery fees even though, because it's the wi-fi model, I am providing the bandwidth myself. I can see that differentiating between 3G and wi-fi delivery is a bit hard for them though.
However my one and only real complaint with the offering as a whole is the astronomical asking price for the leather cover. The cover is currently 25% of the price of the kindle itself! (£30 cover, £110 kindle) which is just silly. It is a pretty nice cover and the clever clip attachment means it does offer an integrated solution to protecting your kindle, but not £30 nice.

Kindle in a sock cover
So my lovely wife (her kindle was bought with the cover) made me a sock for mine. This is great for casual round the house usage to stop me scuffing the screen but was a bit lightweight for protecting the kindle when out and about.

One day last week I had an idea. I would make my own protective cover by crafting something I had wanted to do for ages. And the (unoriginal I am sure) project of a hollowed out book for housing my kindle was implemented.


My hollowed out book kindle cover
A quick Google later and I had a set of plausible instructions to follow. I used possibly the most out of date book ever (published 1981) on electronic test equipment, partly because it was an ex-library sell-off book which cost 10 pence back in 1995 but mainly because it was the right size to just enclose the kindle without adding too much size.

I learnt a couple of things doing this:
  • Do not let your pva (white) glue mix get too runny, you want it fluid enough to be easily absorbed but not watery - this is important because otherwise the paper absorbs too much water and crinkles
  • Do not use a book where the binding has gone bad already and select a "clean" book. The spine of this book was yellowed and cracking before I started. This means the book spine simply cracks open at the hollowed out bit and it is very obvious.
  • Work out where the "solid" part at the back is going to be and treat that separately so you get a nice solid base at the back of the hole. In mine it's not all stuck together and is a bit wavy. Do be sure you leave enough depth for the kindle though.
  • Take your time and be careful with the glue, it is amazing how obvious even a simple splash of glue in the wrong place is. Use a small brush for this; a paint brush is fast but sloppy.
  • Measure carefully and cut only a few pages at a time, it takes a bit longer but looks much better. Also I did not drill the corners of my hole which means they are a little scruffy.
  • Use the sharpest thinnest knife you can, this really helps. I started with a small stanley knife but switching to my hobby scalpel gave much better results.
  • If you have some, use woodworking clamps to clamp a bit of timber (I had some offcuts of shelving) around the book to compress it while the glue dries. Do not clamp the spine if you can avoid it. This method ensures:
    1. Heavy things do not fall off the book while it dries.
    2. An even strong pressure is applied.
    3. The book does not warp or bend while the glue dries
All in all I kinda like the results and I think I will try again with a more modern book where the spine is not so broken to begin with.

15 October 2010

Enrico Zini: Award winning code

Me and Yuwei had a fun day at hhhmcr (#hhhmcr) and even managed to put together a prototype that won the first prize \o/ We played with the gmp24 dataset kindly extracted from Twitter by Michael Brunton-Spall of the Guardian into a convenient JSON dataset. The idea was to find ways of making it easier to look at the data and make sense of it. This is the story of what we did, including the code we wrote. The original dataset has several JSON files, so the first task was to put them all together:
#!/usr/bin/python
# Merge the JSON data
# (C) 2010 Enrico Zini <enrico@enricozini.org>
# License: WTFPL version 2 (http://sam.zoy.org/wtfpl/)
import simplejson
import os
res = []
for f in os.listdir("."):
    if not f.startswith("gmp24"): continue
    data = open(f).read().strip()
    if data == "[]": continue
    parsed = simplejson.loads(data)
    res.extend(parsed)
print simplejson.dumps(res)
The results however were not ordered by date, as GMP had to use several accounts to twit because Twitter was putting Greater Manchester Police into jail for generating too much traffic. There would be quite a bit to write about that, but let's stick to our work. Here is code to sort the JSON data by time:
#!/usr/bin/python
# Sort the JSON data
# (C) 2010 Enrico Zini <enrico@enricozini.org>
# License: WTFPL version 2 (http://sam.zoy.org/wtfpl/)
import simplejson
import sys
import datetime as dt
all_recs = simplejson.load(sys.stdin)
all_recs.sort(key=lambda x: dt.datetime.strptime(x["created_at"], "%a %b %d %H:%M:%S +0000 %Y"))
simplejson.dump(all_recs, sys.stdout)
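The reason the script above parses created_at instead of sorting on the raw string is that the timestamp starts with the weekday name, so a plain string sort would group tweets by weekday rather than by time. Here is a small Python 3 sketch of the same idea. The sample records are made up for illustration, and the helper name sort_by_created_at is mine, not part of the original script.

```python
import datetime as dt

# Same format string as in the script above
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S +0000 %Y"


def sort_by_created_at(records):
    """Return the records ordered by their parsed created_at timestamp."""
    return sorted(
        records,
        key=lambda rec: dt.datetime.strptime(rec["created_at"], TWITTER_TIME_FORMAT),
    )


# Made-up sample records, deliberately out of order
sample = [
    {"id": 2, "created_at": "Thu Oct 14 09:30:00 +0000 2010"},
    {"id": 1, "created_at": "Thu Oct 14 05:00:12 +0000 2010"},
]
print([rec["id"] for rec in sort_by_created_at(sample)])  # prints [1, 2]
```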
I then wanted to play with Tf-idf for extracting the most important words of every tweet:
#!/usr/bin/python
# tfidf - Annotate JSON elements with Tf-idf extracted keywords
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
import sys, math
import simplejson
import re
# Read all the twits
records = simplejson.load(sys.stdin)
# All the twits by ID
byid = dict(((x["id"], x) for x in records))
# Stopwords we ignore
stopwords = set(["by", "it", "and", "of", "in", "a", "to"])
# Tokenising engine
re_num = re.compile(r"^\d+$")
re_word = re.compile(r"(\w+)")
def tokenise(tweet):
    "Extract tokens from a tweet"
    for tok in tweet["text"].split():
        tok = tok.strip().lower()
        if re_num.match(tok): continue
        mo = re_word.match(tok)
        if not mo: continue
        if mo.group(1) in stopwords: continue
        yield mo.group(1)
# Extract tokens from tweets
tokenised = dict(((x["id"], list(tokenise(x))) for x in records))
# Aggregate token counts
aggregated = {}
for d in byid.iterkeys():
    for t in tokenised[d]:
        if t in aggregated:
            aggregated[t] += 1
        else:
            aggregated[t] = 1
def tfidf(doc, tok):
    "Compute TFIDF score of a token in a document"
    return doc.count(tok) * math.log(float(len(byid)) / aggregated[tok])
# Annotate tweets with keywords
res = []
for name, tweet in byid.iteritems():
    doc = tokenised[name]
    keywords = sorted(set(doc), key=lambda tok: tfidf(doc, tok), reverse=True)[:5]
    tweet["keywords"] = keywords
    res.append(tweet)
simplejson.dump(res, sys.stdout)
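To make the scoring easier to follow, here is a compact Python 3 sketch of the same computation on a tiny made-up corpus. Like the script above, it divides the number of documents by the total number of occurrences of a token across all documents, which is a simplification of the usual document-frequency definition. The function name top_keywords and the alphabetical tie-break are my additions, the latter only to make the output deterministic.

```python
import math
from collections import Counter


def top_keywords(docs, limit=5):
    """docs maps a document id to its list of tokens.

    Returns a dict mapping each id to its highest scoring tokens.
    """
    # Total occurrences of each token over the whole corpus
    aggregated = Counter(tok for toks in docs.values() for tok in toks)
    n_docs = len(docs)

    def score(toks, tok):
        # Term count in this document times the log of the inverse frequency
        return toks.count(tok) * math.log(n_docs / aggregated[tok])

    return {
        doc_id: sorted(set(toks), key=lambda t: (-score(toks, t), t))[:limit]
        for doc_id, toks in docs.items()
    }


# Made-up, already tokenised tweets
docs = {
    1: ["police", "called", "burglary", "wigan"],
    2: ["police", "called", "theft", "bolton"],
    3: ["police", "lost", "dog", "wigan"],
}
print(top_keywords(docs, limit=2)[1])  # prints ['burglary', 'called']
```

A word like "police" that appears in every tweet scores zero, which is exactly why it drops out of the keywords.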
I thought this was producing a nice summary of every tweet but nobody was particularly interested, so we moved on to adding categories to tweets. Thanks to Yuwei, who put together some useful keyword sets, we managed to annotate each tweet with a place name (e.g. "Stockport"), a social place name (e.g. "pub", "bank") and a social category (e.g. "man", "woman", "landlord"...). The code is simple; the biggest work in it was the dictionary of keywords:
#!/usr/bin/python
# categorise - Annotate JSON elements with categories
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
# Copyright (C) 2010  Yuwei Lin <yuwei@ylin.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
import sys, math
import simplejson
import re
# Electoral wards from http://en.wikipedia.org/wiki/List_of_electoral_wards_in_Greater_Manchester
placenames = ["Altrincham", "Sale West",
"Altrincham", "Ashton upon Mersey", "Bowdon", "Broadheath", "Hale Barns", "Hale Central", "St Mary", "Timperley", "Village",
"Ashton-under-Lyne",
"Ashton Hurst", "Ashton St Michael", "Ashton Waterloo", "Droylsden East", "Droylsden West", "Failsworth East", "Failsworth West", "St Peter",
"Blackley", "Broughton",
"Broughton", "Charlestown", "Cheetham", "Crumpsall", "Harpurhey", "Higher Blackley", "Kersal",
"Bolton North East",
"Astley Bridge", "Bradshaw", "Breightmet", "Bromley Cross", "Crompton", "Halliwell", "Tonge with the Haulgh",
"Bolton South East",
"Farnworth", "Great Lever", "Harper Green", "Hulton", "Kearsley", "Little Lever", "Darcy Lever", "Rumworth",
"Bolton West",
"Atherton", "Heaton", "Lostock", "Horwich", "Blackrod", "Horwich North East", "Smithills", "Westhoughton North", "Chew Moor", "Westhoughton South",
"Bury North",
"Church", "East", "Elton", "Moorside", "North Manor", "Ramsbottom", "Redvales", "Tottington",
"Bury South",
"Besses", "Holyrood", "Pilkington Park", "Radcliffe East", "Radcliffe North", "Radcliffe West", "St Mary", "Sedgley", "Unsworth",
"Cheadle",
"Bramhall North", "Bramhall South", "Cheadle", "Gatley", "Cheadle Hulme North", "Cheadle Hulme South", "Heald Green", "Stepping Hill",
"Denton", "Reddish",
"Audenshaw", "Denton North East", "Denton South", "Denton West", "Dukinfield", "Reddish North", "Reddish South",
"Hazel Grove",
"Bredbury", "Woodley", "Bredbury Green", "Romiley", "Hazel Grove", "Marple North", "Marple South", "Offerton",
"Heywood", "Middleton",
"Bamford", "Castleton", "East Middleton", "Hopwood Hall", "Norden", "North Heywood", "North Middleton", "South Middleton", "West Heywood", "West Middleton",
"Leigh",
"Astley Mosley Common", "Atherleigh", "Golborne", "Lowton West", "Leigh East", "Leigh South", "Leigh West", "Lowton East", "Tyldesley",
"Makerfield",
"Abram", "Ashton", "Bryn", "Hindley", "Hindley Green", "Orrell", "Winstanley", "Worsley Mesnes",
"Manchester Central",
"Ancoats", "Clayton", "Ardwick", "Bradford", "City Centre", "Hulme", "Miles Platting", "Newton Heath", "Moss Side", "Moston",
"Manchester", "Gorton",
"Fallowfield", "Gorton North", "Gorton South", "Levenshulme", "Longsight", "Rusholme", "Whalley Range",
"Manchester", "Withington",
"Burnage", "Chorlton", "Chorlton Park", "Didsbury East", "Didsbury West", "Old Moat", "Withington",
"Oldham East", "Saddleworth",
"Alexandra", "Crompton", "Saddleworth North", "Saddleworth South", "Saddleworth West", "Lees", "St James", "St Mary", "Shaw", "Waterhead",
"Oldham West", "Royton",
"Chadderton Central", "Chadderton North", "Chadderton South", "Coldhurst", "Hollinwood", "Medlock Vale", "Royton North", "Royton South", "Werneth",
"Rochdale",
"Balderstone", "Kirkholt", "Central Rochdale", "Healey", "Kingsway", "Littleborough Lakeside", "Milkstone", "Deeplish", "Milnrow", "Newhey", "Smallbridge", "Firgrove", "Spotland", "Falinge", "Wardle", "West Littleborough",
"Salford", "Eccles",
"Claremont", "Eccles", "Irwell Riverside", "Langworthy", "Ordsall", "Pendlebury", "Swinton North", "Swinton South", "Weaste", "Seedley",
"Stalybridge", "Hyde",
"Dukinfield Stalybridge", "Hyde Godley", "Hyde Newton", "Hyde Werneth", "Longdendale", "Mossley", "Stalybridge North", "Stalybridge South",
"Stockport",
"Brinnington", "Central", "Davenport", "Cale Green", "Edgeley", "Cheadle Heath", "Heatons North", "Heatons South", "Manor",
"Stretford", "Urmston",
"Bucklow-St Martins", "Clifford", "Davyhulme East", "Davyhulme West", "Flixton", "Gorse Hill", "Longford", "Stretford", "Urmston",
"Wigan",
"Aspull New Springs Whelley", "Douglas", "Ince", "Pemberton", "Shevington with Lower Ground", "Standish with Langtree", "Wigan Central", "Wigan West",
"Worsley", "Eccles South",
"Barton", "Boothstown", "Ellenbrook", "Cadishead", "Irlam", "Little Hulton", "Walkden North", "Walkden South", "Winton", "Worsley",
"Wythenshawe", "Sale East",
"Baguley", "Brooklands", "Northenden", "Priory", "Sale Moor", "Sharston", "Woodhouse Park"]
# Manual coding from Yuwei
placenames.extend(["City centre", "Tameside", "Oldham", "Bury", "Bolton",
"Trafford", "Pendleton", "New Moston", "Denton", "Eccles", "Leigh", "Benchill",
"Prestwich", "Sale", "Kearsley", ])
placenames.extend(["Trafford", "Bolton", "Stockport", "Levenshulme", "Gorton",
"Tameside", "Blackley", "City centre", "Airport", "South Manchester",
"Rochdale", "Chorlton", "Uppermill", "Castleton", "Stalybridge", "Ashton",
"Chadderton", "Bury", "Ancoats", "Whalley Range", "West Yorkshire",
"Fallowfield", "New Moston", "Denton", "Stretford", "Eccles", "Pendleton",
"Leigh", "Altrincham", "Sale", "Prestwich", "Kearsley", "Hulme", "Withington",
"Moss Side", "Milnrow", "outskirt of Manchester City Centre", "Newton Heath",
"Wythenshawe", "Mancunian Way", "M60", "A6", "Droylesden", "M56", "Timperley",
"Higher Ince", "Clayton", "Higher Blackley", "Lowton", "Droylsden",
"Partington", "Cheetham Hill", "Benchill", "Longsight", "Didsbury",
"Westhoughton"])
# Social categories from Yuwei
soccat = ["man", "woman", "men", "women", "youth", "teenager", "elderly",
"patient", "taxi driver", "neighbour", "male", "tenant", "landlord", "child",
"children", "immigrant", "female", "workmen", "boy", "girl", "foster parents",
"next of kin"]
for i in range(100):
    soccat.append("%d-year-old" % i)
    soccat.append("%d-years-old" % i)
# Types of social locations from Yuwei
socloc = ["car park", "park", "pub", "club", "shop", "premises", "bus stop",
"property", "credit card", "supermarket", "garden", "phone box", "theatre",
"toilet", "building site", "Crown court", "hard shoulder", "telephone kiosk",
"hotel", "restaurant", "cafe", "petrol station", "bank", "school",
"university"]
extras = { "placename": placenames, "soccat": soccat, "socloc": socloc }
# Normalise keyword lists
for k, v in extras.items():
    # Remove duplicates and sort by length, longest first, storing the
    # result back in extras so that add_categories actually sees it
    extras[k] = sorted(set(v), key=len, reverse=True)
# Add keywords
def add_categories(tweet):
    text = tweet["text"].lower()
    for field, categories in extras.iteritems():
        for cat in categories:
            if cat.lower() in text:
                tweet[field] = cat
                break
    return tweet
# Read all the twits
records = (add_categories(x) for x in simplejson.load(sys.stdin))
simplejson.dump(list(records), sys.stdout)
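The point of sorting each keyword list by length before matching is that the first hit wins, so a longer phrase such as "car park" should be tried before the shorter "park" it contains. A minimal Python 3 sketch of that matching rule follows. The function name first_match and the sample text are made up for illustration.

```python
def first_match(text, keywords):
    """Return the longest keyword found in text, or None if none matches."""
    text = text.lower()
    # Longest keywords first; ties broken alphabetically for a stable order
    for kw in sorted(set(keywords), key=lambda k: (-len(k), k)):
        if kw.lower() in text:
            return kw
    return None


socloc = ["park", "car park", "pub"]
print(first_match("Car broken into in a car park in Wigan", socloc))  # prints car park
```

Note that this is plain substring matching, as in the script above, so "pub" would also match inside a word like "public". Matching on word boundaries would be a stricter alternative.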
All these scripts form a nice processing chain: each script takes a list of JSON records, adds some bit and passes it on. In order to see what we have so far, here is a simple script to convert the JSON twits to CSV so they can be viewed in a spreadsheet:
#!/usr/bin/python
# Convert the JSON twits to CSV
# (C) 2010 Enrico Zini <enrico@enricozini.org>
# License: WTFPL version 2 (http://sam.zoy.org/wtfpl/)
import simplejson
import sys
import csv
rows = ["id", "created_at", "text", "keywords", "placename"]
writer = csv.writer(sys.stdout)
for rec in simplejson.load(sys.stdin):
    rec["keywords"] = " ".join(rec["keywords"])
    rec["placename"] = rec.get("placename", "")
    writer.writerow([rec[row] for row in rows])
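The processing chain idea, where every script reads a list of JSON records, adds something to each record and writes the list out again, can be captured in one small helper. This is only a sketch in Python 3 with the standard json module rather than simplejson. The names run_stage and add_length are hypothetical and not part of the scripts in this post.

```python
import io
import json
import sys


def run_stage(annotate, infile=sys.stdin, outfile=sys.stdout):
    """Read a JSON list of records, annotate each one, write the list back."""
    records = json.load(infile)
    json.dump([annotate(rec) for rec in records], outfile)


def add_length(rec):
    """Example annotation: store the length of the tweet text."""
    rec["length"] = len(rec["text"])
    return rec


# Demonstration on in-memory streams instead of stdin and stdout
src = io.StringIO('[{"id": 1, "text": "hello"}]')
dst = io.StringIO()
run_stage(add_length, src, dst)
print(dst.getvalue())
```

With the default arguments a script built this way can be chained in a shell pipeline, each stage feeding the next.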
At this point we were coming up with lots of questions: "were there more reports on women or men?", "which place had most incidents?", "what were the incidents involving animals?"... Time to bring Xapian into play. This script reads all the JSON tweets and builds a Xapian index with them:
#!/usr/bin/python
# toxapian - Index JSON tweets in Xapian
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
import simplejson
import sys
import os, os.path
import xapian
DBNAME = sys.argv[1]
db = xapian.WritableDatabase(DBNAME, xapian.DB_CREATE_OR_OPEN)
stemmer = xapian.Stem("english")
indexer = xapian.TermGenerator()
indexer.set_stemmer(stemmer)
indexer.set_database(db)
data = simplejson.load(sys.stdin)
for rec in data:
    doc = xapian.Document()
    doc.set_data(str(rec["id"]))
    indexer.set_document(doc)
    indexer.index_text_without_positions(rec["text"])
    # Index categories as categories
    if "placename" in rec:
        doc.add_boolean_term("XP" + rec["placename"].lower())
    if "soccat" in rec:
        doc.add_boolean_term("XS" + rec["soccat"].lower())
    if "socloc" in rec:
        doc.add_boolean_term("XL" + rec["socloc"].lower())
    db.add_document(doc)
db.flush()
# Also save the whole dataset so we know where to find it later if we want to
# show the details of an entry
simplejson.dump(data, open(os.path.join(DBNAME, "all.json"), "w"))
And this is a simple command line tool to query the database:
#!/usr/bin/python
# xgrep - Command line tool to query the GMP24 tweet Xapian database
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
import simplejson
import sys
import os, os.path
import xapian
DBNAME = sys.argv[1]
db = xapian.Database(DBNAME)
stem = xapian.Stem("english")
qp = xapian.QueryParser()
qp.set_default_op(xapian.Query.OP_AND)
qp.set_database(db)
qp.set_stemmer(stem)
qp.set_stemming_strategy(xapian.QueryParser.STEM_SOME)
qp.add_boolean_prefix("place", "XP")
qp.add_boolean_prefix("soc", "XS")
qp.add_boolean_prefix("loc", "XL")
query = qp.parse_query(sys.argv[2],
    xapian.QueryParser.FLAG_BOOLEAN |
    xapian.QueryParser.FLAG_LOVEHATE |
    xapian.QueryParser.FLAG_BOOLEAN_ANY_CASE |
    xapian.QueryParser.FLAG_WILDCARD |
    xapian.QueryParser.FLAG_PURE_NOT |
    xapian.QueryParser.FLAG_SPELLING_CORRECTION |
    xapian.QueryParser.FLAG_AUTO_SYNONYMS)
enquire = xapian.Enquire(db)
enquire.set_query(query)
count = 40
matches = enquire.get_mset(0, count)
estimated = matches.get_matches_estimated()
print "%d/%d results" % (matches.size(), estimated)
data = dict((str(x["id"]), x) for x in simplejson.load(open(os.path.join(DBNAME, "all.json"))))
for m in matches:
    rec = data[m.document.get_data()]
    print rec["text"]
print "%d/%d results" % (matches.size(), matches.get_matches_estimated())
total = db.get_doccount()
estimated = matches.get_matches_estimated()
print "%d results over %d documents, %d%%" % (estimated, total, estimated * 100 / total)
Neat! Now that we have a proper index that supports all sorts of cool things, like stemming, tag clouds, full text search with complex queries, lookup of similar documents, keyword suggestions and so on, it was only fair to put together a web service to share it with other people at the event. It helped that I had already written similar code for apt-xapian-index and dde before. Here is the server, quickly built on bottle. The very last line starts the server and it is where you can configure the listening interface and port.
#!/usr/bin/python
# xserve - Make the GMP24 tweet Xapian database available on the web
#
# Copyright (C) 2010  Enrico Zini <enrico@enricozini.org>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
import bottle
from bottle import route, post
from cStringIO import StringIO
import cPickle as pickle
import simplejson
import sys
import os, os.path
import xapian
import urllib
import math
bottle.debug(True)
DBNAME = sys.argv[1]
QUERYLOG = os.path.join(DBNAME, "queries.txt")
data = dict((str(x["id"]), x) for x in simplejson.load(open(os.path.join(DBNAME, "all.json"))))
prefixes = { "place": "XP", "soc": "XS", "loc": "XL" }
prefix_desc = { "place": "Place name", "soc": "Social category", "loc": "Social location" }
db = xapian.Database(DBNAME)
stem = xapian.Stem("english")
qp = xapian.QueryParser()
qp.set_default_op(xapian.Query.OP_AND)
qp.set_database(db)
qp.set_stemmer(stem)
qp.set_stemming_strategy(xapian.QueryParser.STEM_SOME)
for k, v in prefixes.iteritems():
    qp.add_boolean_prefix(k, v)
def make_query(qstring):
    return qp.parse_query(qstring,
        xapian.QueryParser.FLAG_BOOLEAN |
        xapian.QueryParser.FLAG_LOVEHATE |
        xapian.QueryParser.FLAG_BOOLEAN_ANY_CASE |
        xapian.QueryParser.FLAG_WILDCARD |
        xapian.QueryParser.FLAG_PURE_NOT |
        xapian.QueryParser.FLAG_SPELLING_CORRECTION |
        xapian.QueryParser.FLAG_AUTO_SYNONYMS)
@route("/")
def index():
    query = urllib.unquote_plus(bottle.request.GET.get("q", ""))
    out = StringIO()
    print >>out, '''
<html>
<head>
<title>Query</title>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"></script>
<script type="text/javascript">
$(function() {
    $("#queryfield")[0].focus()
})
</script>
</head>
<body>
<h1>Search</h1>
<form method="POST" action="/query">
Keywords: <input type="text" name="query" value="%s" id="queryfield">
<input type="submit">
<a href="http://xapian.org/docs/queryparser.html">Help</a>
</form>''' % query
    print >>out, '''
<p>Example: "car place:wigan"</p>

<p>Available prefixes:</p>

<ul>
'''
    for pfx in prefixes.keys():
        print >>out, "<li><a href='/catinfo/%s'>%s - %s</a></li>" % (pfx, pfx, prefix_desc[pfx])
    print >>out, '''
</ul>
'''
    oldqueries = []
    if os.path.exists(QUERYLOG):
        total = db.get_doccount()
        fd = open(QUERYLOG, "r")
        while True:
            try:
                q = pickle.load(fd)
            except EOFError:
                break
            oldqueries.append(q)
        fd.close()
        def print_query(q):
            count = q["count"]
            print >>out, "<li><a href='/query?query=%s'>%s (%d/%d %.2f%%)</a></li>" % (urllib.quote_plus(q["q"]), q["q"], count, total, count * 100.0 / total)
        print >>out, "<p>Last 10 queries:</p><ul>"
        # [:-11:-1] walks back from the end and yields the last 10 entries
        for q in oldqueries[:-11:-1]:
            print_query(q)
        print >>out, "</ul>"
        # Remove duplicates
        oldqueries = dict(((x["q"], x) for x in oldqueries)).values()
        print >>out, "<table>"
        print >>out, "<tr><th>10 queries with most results</th><th>10 queries with least results</th></tr>"
        print >>out, "<tr><td>"
        print >>out, "<ul>"
        oldqueries.sort(key=lambda x:x["count"], reverse=True)
        for q in oldqueries[:10]:
            print_query(q)
        print >>out, "</ul>"
        print >>out, "</td><td>"
        print >>out, "<ul>"
        nonempty = [x for x in oldqueries if x["count"] > 0]
        nonempty.sort(key=lambda x:x["count"])
        for q in nonempty[:10]:
            print_query(q)
        print >>out, "</ul>"
        print >>out, "</td></tr>"
        print >>out, "</table>"
    print >>out, '''
</body>
</html>'''
    return out.getvalue()
@route("/query")
@route("/query/")
@post("/query")
@post("/query/")
def query():
    query = bottle.request.POST.get("query", bottle.request.GET.get("query", ""))
    enquire = xapian.Enquire(db)
    enquire.set_query(make_query(query))
    count = 40
    matches = enquire.get_mset(0, count)
    estimated = matches.get_matches_estimated()
    total = db.get_doccount()
    out = StringIO()
    print >>out, '''
<html>
<head><title>Results</title></head>
<body>
<h1>Results for "<b>%s</b>"</h1>
''' % query
    if estimated == 0:
        print >>out, "No results found."
    else:
        # Give as results the first 30 documents; also use them as the key
        # ones to use to compute relevant terms
        rset = xapian.RSet()
        for m in enquire.get_mset(0, 30):
            rset.add_document(m.document.get_docid())
        # Compute the tag cloud
        class NonTagFilter(xapian.ExpandDecider):
            def __call__(self, term):
                return not term[0].isupper() and not term[0].isdigit()
        cloud = []
        maxscore = None
        for res in enquire.get_eset(40, rset, NonTagFilter()):
            # Normalise the score in the interval [0, 1]
            weight = math.log(res.weight)
            if maxscore is None: maxscore = weight
            tag = res.term
            cloud.append([tag, float(weight) / maxscore])
        max_weight = cloud[0][1]
        min_weight = cloud[-1][1]
        cloud.sort(key=lambda x:x[0])
        def mklink(query, term):
            return "/query?query=%s" % urllib.quote_plus(query + " and " + term)
        print >>out, "<h2>Tag cloud</h2>"
        print >>out, "<blockquote>"
        # Guard against a ZeroDivisionError when all weights are equal
        spread = (max_weight - min_weight) or 1.0
        for term, weight in cloud:
            size = 100 + 100.0 * (weight - min_weight) / spread
            print >>out, "<a href='%s' style='font-size:%d%%; color:brown;'>%s</a>" % (mklink(query, term), size, term)
        print >>out, "</blockquote>"
        print >>out, "<h2>Results</h2>"
        print >>out, "<p><a href='/'>Search again</a></p>"
        print >>out, "<p>%d results over %d documents, %.2f%%</p>" % (estimated, total, estimated * 100.0 / total)
        print >>out, "<p>%d/%d results</p>" % (matches.size(), estimated)
        print >>out, "<ul>"
        for m in matches:
            rec = data[m.document.get_data()]
            print >>out, "<li><a href='/item/%s'>%s</a></li>" % (rec["id"], rec["text"])
        print >>out, "</ul>"
        fd = open(QUERYLOG, "a")
        qinfo = dict(q=query, count=estimated)
        pickle.dump(qinfo, fd)
        fd.close()
    print >>out, '''
<a href="/">Search again</a>

</body>
</html>'''
    return out.getvalue()
@route("/item/:id")
@route("/item/:id/")
def show(id):
    rec = data[id]
    out = StringIO()
    print >>out, '''
<html>
<head><title>Result %s</title></head>
<body>
<h1>Raw JSON record for twit %s</h1>
<pre>''' % (rec["id"], rec["id"])
    print >>out, simplejson.dumps(rec, indent=" ")
    print >>out, '''
</pre>
</body>
</html>'''
    return out.getvalue()
@route("/catinfo/:name")
@route("/catinfo/:name/")
def catinfo(name):
    prefix = prefixes[name]
    out = StringIO()
    print >>out, '''
<html>
<head><title>Values for %s</title></head>
<body>
''' % name
    terms = [(x.term[len(prefix):], db.get_termfreq(x.term)) for x in db.allterms(prefix)]
    terms.sort(key=lambda x:x[1], reverse=True)
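    # terms is sorted by descending frequency: the first entry is the
    # maximum and the last entry is the minimum
    freq_max = terms[0][1]
    freq_min = terms[-1][1]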
    def mklink(name, term):
        return "/query?query=%s" % urllib.quote_plus(name + ":" + term)
    # Build tag cloud
    print >>out, "<h1>Tag cloud</h1>"
    print >>out, "<blockquote>"
    # Guard against a ZeroDivisionError when all frequencies are equal
    freq_spread = (freq_max - freq_min) or 1
    for term, freq in sorted(terms[:20], key=lambda x:x[0]):
        size = 100 + 100.0 * (freq - freq_min) / freq_spread
        print >>out, "<a href='%s' style='font-size:%d%%; color:brown;'>%s</a>" % (mklink(name, term), size, term)
    print >>out, "</blockquote>"
    print >>out, "<h1>All terms</h1>"
    print >>out, "<table>"
    print >>out, "<tr><th>Occurrences</th><th>Name</th></tr>"
    for term, freq in terms:
        print >>out, "<tr><td>%d</td><td><a href='/query?query=%s'>%s</a></td></tr>" % (freq, urllib.quote_plus(name + ":" + term), term)
    print >>out, "</table>"
    print >>out, '''
</body>
</html>'''
    return out.getvalue()
# Change here for bind host and port
bottle.run(host="0.0.0.0", port=8024)
...and then we presented our work and ended up winning the contest. This was the story of how we wrote this set of award-winning code.
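The font-size scaling used for the tag clouds in the query and catinfo handlers above can be sketched in isolation. This is a minimal Python 3 sketch, not the code that ran at the contest: the function name font_sizes and its min_size/max_size parameters are hypothetical, and it assumes the same 100% to 200% range as the handlers. Unlike the handlers, it takes the minimum and maximum from the data itself instead of relying on sort order, and it handles an empty list and equal weights explicitly.

```python
def font_sizes(weighted_terms, min_size=100, max_size=200):
    """Map (term, weight) pairs to (term, font-size-percent) pairs.

    Sizes are scaled linearly between min_size and max_size and the
    result is sorted alphabetically by term, as in a tag cloud.
    """
    if not weighted_terms:
        return []
    weights = [weight for _, weight in weighted_terms]
    lowest, highest = min(weights), max(weights)
    spread = highest - lowest
    sized = []
    for term, weight in weighted_terms:
        if spread == 0:
            # All weights are equal: avoid dividing by zero
            size = min_size
        else:
            size = min_size + (max_size - min_size) * (weight - lowest) / spread
        sized.append((term, int(size)))
    return sorted(sized)


if __name__ == "__main__":
    for term, size in font_sizes([("debian", 10), ("apt", 2), ("xapian", 6)]):
        print("<a style='font-size:%d%%'>%s</a>" % (size, term))
```

Computing the range from the weights keeps the result independent of how the caller sorted its list, which is the kind of assumption that is easy to get backwards.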

4 February 2010

Sylvain Le Gall: FOSDEM 2010

Last year, I was not able to attend FOSDEM due to last-minute problems. However, this year I will be there and will even attend the Debian booth for some periods ([http://wiki.debian.org/DebianEvents/FOSDEM/2010]). I will bring my Openbrick NG with a standard Debian Lenny and probably a Babelbox installation. The Openbrick is a VIA C3 fanless computer. It is not very exotic, but it is quite interesting to see this kind of hardware. For years, I have tried to build and use fanless computers. This is not a very popular topic, but it introduces problems of heat and noise at a higher level. I am still setting up the Babelbox, which should have been a Debian Lenny RC1. I will try to upgrade it to Debian Lenny 5.0.3. See you at FOSDEM 2010.

19 September 2009

Tiago Bortoletto Vaz: Single tree-like navigation topbar for Beamer


I like the Antibes and JuanLesPins themes from the Beamer class. They use a nice tree-like navigation at the top, which gives a good idea of where exactly the speaker is in her/his talk. This feature is especially important for presentations given in places where attendees don't care too much about being on time. Antibes and JuanLesPins use the tree and smoothtree outer themes. The problem for me is that they provide a 3-level tree navigation (Title, Section and Subsection), plus the Frame Title. I don't have subsections in my presentations. Also, I don't think it's necessary to have the main title in every single frame (but if you enjoy it, try using one of the nice beamer footers instead, for example the one from the Madrid theme). Unfortunately I couldn't find any beamer theme with Section plus Frame Title only. So, a very simple cleanup in the tree outer theme gave me what I was looking for: beamerouterthemesingletree.sty. If you want to use this, just set it as the outer theme in your tex file. Supposing the file name is beamerouterthemesingletree.sty and it's located in the same directory as your tex file, add: \useoutertheme{singletree}

9 May 2009

Andrew Pollock: [life] A few days in British Columbia

We got back from our brief visit to Canada last night. We had a great time, Vancouver and Victoria were both very nice. YVR (Vancouver Airport) This airport has to win the award for nicest airport I've seen. Clearly they pulled out all the stops in preparation for the Winter Olympics next year. The airport is very bright, airy and open. Before you even hit the immigration line, you pass through a couple of water features, some indigenous art, all of that jazz. Immigration is done in the open in a huge hall, instead of in the dark bowels of the airport, which was a pleasant change from the usual US airport immigration experience. (That said, Houston airport had a pretty nice immigration experience). Oh, and it was blanketed with free WiFi, which was a nice touch. Ferry to Victoria The brief geography lesson I got was that Victoria is on Vancouver Island, and Vancouver (the city) is not on Vancouver (the island). Go figure. Getting to Victoria from the airport via BC Ferries was a very seamless experience. You board a bus from the side of the airport terminal, and it drives to Tsawwassen, drives onto the ferry, and then you get off the bus and go sit on one of the passenger levels of the ferry. BC Ferries BC Ferries deserves special mention. The ferry itself is huge - 7 decks (three of them for vehicles). It's a veritable shopping mall on water. It's got some stores, buffets, exclusive $10 entry lounges, video arcades, cellular payphones, the works. The trip to Victoria The ferry ride took about 1.5 hours, and was very scenic. We passed by a few smaller islands, and I had an instant flashback to the TV show The Beachcombers that I used to watch as a kid, particularly the opening sequence. Victoria Victoria is a beautiful city. It's the capital of the province of British Columbia, and I likened it to Canberra in a lot of ways. It had more modern conveniences than a town with a population of its size would otherwise normally have. 
It was a government and university town. Nice houses, very green in general, and being an island, there was water everywhere. It seemed to have lots of bays, and harbours and rivers. It was very nice. Flight back to Vancouver We took a Harbour Air seaplane back to Vancouver on Sunday afternoon. That was fun. I've never been on a seaplane before. It took about 35 minutes. Vancouver First order of business in Vancouver was the purpose of the whole trip: renewing our visas. That went fine, I'll write a separate post about the process. Afterwards, we bought tickets for the Big Bus, and toured the city for the remainder of Monday and Tuesday. Drive to Whistler For Wednesday and Thursday, we hired a car, and went a bit further afield. On Wednesday, we drove up to Whistler to check it out. The drive up took about 5 hours, because we stopped at every scenic spot we came by. Waterfalls abounded. There was a lot of work being done on the roads between Vancouver and Whistler, upgrading them for the Olympics. Whistler itself was pretty spectacular. Some crazy looking ski runs. The village looked like Squaw Village on steroids. The drive back only took about 2 hours. Cleveland Dam, Lynn Canyon suspension bridge For the last day, we drove to Cleveland Dam, where there was a salmon hatchery, and checked out the dam, hatchery and surrounding area. It was lovely and green. After that, we drove to Lynn Canyon to check out the suspension bridge there (unlike the Capilano Suspension Bridge, which costs $26 a pop, the Lynn Canyon one is free). That was all we had time for really. The weather wasn't fantastic during our stay, but it didn't really prevent us from doing anything either. I'd have liked to have checked out Stanley Park if we had more time, and Sarah wanted to go to Grouse Mountain. We had a near-100% success rate at being picked as Australians, unlike when we're in the US, where more often than not, people ask us if we're British. 
Apparently this is mainly attributed to the fact that Australians run all of the ski lifts at Whistler. We certainly heard a lot of Australian accents while we were getting around. I'd also never seen such a saturation of Starbucks before. In Vancouver, there were literally Starbucks across the road from Starbucks, and around the corner from another one. Whistler village had two. The Canadian accent is cute. They really do tack "eh?" on the end of everything, although it's phonetically more like "ay?". "House" and "about" are also pronounced distinctively differently. Overall, Vancouver wasn't what I'd call bursting at the seams with tourist attractions, but seemed like a nice city. It was fairly flat. The beggars were well dressed. Everyone was very friendly. A lot of the taxis were Priuses. It seemed pretty clean. I could handle living there. Not sure how bad it gets in winter.

30 January 2009

Jon Dowland: state of fear

I've recently read Michael Crichton's controversial "State of Fear": A book where radical environmentalists stage terror attacks to promote the idea of catastrophic climate change. I read this book as a counterbalance to a few other things I have consumed recently: Kim Stanley Robinson's excellent ecological series "40 Signs of Rain", "50 Degrees Below" and "60 Days and Counting"; as well as Al Gore's "An Inconvenient Truth". I can also kid myself that I don't exclusively read Science Fiction (It's filed under "literature!" honest!) Given these points, I was not expecting to enjoy this book. It's been a while since I last read a Crichton, so it might just be lack of familiarity with his style, but I found the beginning of the book to be a bit frustrating: he doesn't spend much time detailing the world and situations that his characters find themselves in. Some of the early characters are typical Star Trek-style red shirts who exist solely to be killed off. Eventually I either meshed gears with it pacing-wise or the main characters stopped being cardboard because I started to enjoy it. Sometimes it gets a bit preachy and I can see why climate scientists and global warming activists got so wound up about it, although I think some of the reactions are very knee-jerk. At the end of the book, Crichton has a short essay about his feelings regarding climate change, followed by some references. The message I took away from the book was not "global warming is all a farce" as many of the reviews and descriptions suggested to me, but rather, evidence-based science is crucial and don't take things for granted: including either stance on climate change, two points I strongly agree with. There are some other points he raises which I think are really good ideas and hope to write more about soon.

15 January 2009

Julien Blache: Antec PSU suck

How not to start your day, from the all-hardware-sucks department. For years, I've been struggling to keep my machines quiet. I've miserably failed for years, and pretty much gave up once I realised that the quiet parts I was looking for just did not exist. Last year, I finally found what I was looking for: I've been very happy with this hardware for both my workstation and my filer. Well, until today. No need to mention that this hardware is not exactly cheap, but the price is right for the quality. So, wake up today, do random things, grab my TomTom, unplug it from the workstation it's sucking its power from, do things, plug it back in the USB port so it won't draw its battery. This is how it went: Check machine temp, PSU temp, attempt to restart PSU, check UPS, check cables, replace power cable and bypass UPS. Admit defeat for this round and accept the PSU as dead. Pull out the machine, rip out the fscking damn PSU. Grab the tools, crack the PSU open. Notice bulging and leaking capacitors. Yes, this PSU is anything but cheap, yet it uses el-cheapo Chinese capacitors. Antec, you suck, big time. For the past decade everybody in the industry has known that el-cheapo Chinese capacitors manufactured after 1999 are total crap, using an electrolyte that isn't actually one because its formula was stolen by industrial spies who got totally PWNED and ended up stealing a bad formula. (If you need a reference, google for "singing capacitors". Note that singing capacitors can also happen with good parts used in a crappy electronic design. Like that 100 Mbps D-Link switch over there, or that Sony-Ericsson mobile phone charger. Hmm. Not everybody can hear it, I do.) A healthy, young capacitor should not bulge nor leak. A capacitor that leaks, bulges or sings is a crappy capacitor and needs to be replaced with a quality part ASAP. I've been routinely replacing such capacitors on my older motherboards (manufactured between 1999 and 2001) in the past years.
I've not had the problem in ANY of my el-cheapo PSUs, and did not replace any capacitor in any of them to date. Fortunately, the machine did not suffer any further damage, nor did the TomTom. Everything is up and running again with a spare PSU. I'm now going to replace the fscking capacitors on this PSU, replace the fuse, probably R&R a couple of high-voltage transistors too (they have a tendency to fry before the fuse blows out) and get it back up and running. Then I'll do the same with the other PSU in the filer, because that machine has 8 disks attached, so draws more power and hence is much more vulnerable. Fuck you Antec.

30 September 2008

Axel Beckert: Mini-ITX based Home Server: Planning and Hardware

Ever since my former desktop machine gsa died and I started using only laptops at home, I have noticed a need for a home server for storing all my MP3s, holiday pictures, games, and backups of my other machines. And I also want a filtering web proxy at home again. Current situation Currently my Norhtec MicroClient Jr. “c2” with its 120 GB 2.5" hard disk does some of these jobs (mostly storage and backup), but it has neither the disk space nor the performance to do all the things I want. For storage I once bought a TheCus N4100, the big brother of the popular and officially Debian-supported N2100. Unfortunately there are a few differences from the N2100 (NIC without MAC) which make it much more difficult to get Debian on it, and the original firmware doesn’t support NFS at all. *grmpf* I had hints from others who managed to get Debian on this NAS, but I didn’t find the time and leisure to really dig into cross-compiling kernels. (Although with the new 1.3.06 firmware, so-called modules became possible for the N4100 as well, and an SSH module has been posted with which a Debian chroot could be installed and the required kernel built on the machine itself.) I wasn’t very angry, though, when the N4100+ came out shortly after I bought the N4100, because the N4100+ was no longer an ARM-based device but had a Celeron processor inside instead. And a NAS which is built on average PC hardware wasn’t as appealing as some device based on some more exotic architecture mainly used in embedded devices. :-) The Mini-ITX Appeal This view changed rapidly when Raffzahn showed me a few Mini-ITX boards and cases. I surfed around on the Mini-ITX.com store and stumbled upon the NAS-like ES34069 case from Chenbro featuring four S-ATA hotswap 3.5" slots, a slim-line CD-ROM drive slot, an SD card reader, and enough space for an additional 2.5" hard disk and a low-profile Mini-ITX board.
Additionally, the VIA EPIA SN series of Mini-ITX boards sports 4 S-ATA ports and either a passively cooled 1 GHz C7 processor or an actively cooled 1.8 GHz C7 processor. That should be enough power for a small multi-purpose home server while still keeping the power consumption low. And I’m not the only one having this idea: Mini-ITX.com suggests this combination and Chenbro officially supports the VIA EPIA SN boards. Additionally, Debian 5.0 Lenny seems to run fine on the SN series; only lm-sensors seems to have problems with SN18000G and SN10000EG (but not SN18000 and SN10000E). So when the Chenbro ES34069 case showed up in digitec’s online shop, I ordered one there and a VIA EPIA SN18000G board at Brack. I didn’t order any disks, since for data storage I plan to use the four Samsung 400 GB 3.5" S-ATA disks I bought for the N4100, and for the system I plan to use the 2.5" disk I initially bought for my MicroClient JrSX “c1”, which I then continued to use only with the CF card. Not yet sure if I’ll also equip the slim-line optical drive slot. The case took several weeks to deliver and the mainboard hasn’t arrived yet. Instead I got an e-mail from Brack that VIA products are currently very difficult to get in Switzerland. The reason is said to be that VIA tries to channel the distribution of their products to a single distributor. (Sounds somehow similar to what Apple tried with the iPhone and failed.) Mini-ITX boards and power consumption So I now have a nice case without a board. There aren’t that many Mini-ITX boards out there sporting 4 S-ATA ports. One which clearly stood out was the new Intel DG45FC Mini-ITX board with LGA775 socket. (In Switzerland neither available at Brack nor at digitec, but e.g. at PCP.) But reading the specs of this board it was also clear that it wasn’t designed for NAS systems but for high-performance HTPCs: the focus seems to be on multimedia performance, which a NAS doesn’t need.
Its newer sister, the Intel DQ45EK Mini-ITX board, is focussed more on office and business PCs than on multimedia. But Intel’s remote administration is not really a plus for me (don’t need it, I’ve got SSH ;-) and it’s neither cheaper than the DG45FC nor does it have significantly lower power consumption. Despite the 120W power supply there are people who have already combined the Chenbro ES34069 with the Intel DG45FC or DQ45EK board, e.g. one of the administrators of the German NAS-Portal forums built such a machine, and this German guy who wants to build a Windows Home Server based on such a combination. At least the NAS-Portal administrator found out that the board consumes so much power that together with the 4 S-ATA disks the included 120W power supply doesn’t suffice and the system is not stable in this configuration. Trusted Reviews’ review of the DG45FC explains why: It’s one of the first Mini-ITX boards not following the MoDT idea, has a desktop chipset instead of a mobile chipset and therefore lacks some of the power-saving features of those mobile chipsets. But it’s easy to see anyway: Most of the CPUs supported by the DG45FC and DQ45EK boards have a TDP of 65W. Officially the processor cooler delivered with the case supports processors with up to 65W, but 65W is already more than half of what the power supply delivers and, according to the Trusted Reviews review, the board itself consumes another 35W. So for the four 3.5" S-ATA disks, which are usually not as economical as notebook disks, about 20W are left. This can’t work! The guy from NAS-Portal.org plans to solve the problem by using a universal 180W notebook power supply instead of the original one. In comparison to the 100W of both Intel boards, VIA’s SN18000G consumes only 26W (the fanless SN10000EG even only 22W) and that’s board and processor! That’s about ¼ of what the Intel board consumes.
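The power budget above can be checked with a few lines of Python. This is only a sketch using the figures quoted in the post; it assumes, as the post does, that a CPU's TDP can stand in for its actual draw, and the function name watts_left_for_disks is made up for illustration.

```python
# Figures quoted in the post; treating TDP as actual draw is an assumption.
PSU_WATTS = 120          # power supply included with the Chenbro ES34069
CPU_TDP_WATTS = 65       # TDP of most CPUs supported by the DG45FC / DQ45EK
INTEL_BOARD_WATTS = 35   # board consumption according to Trusted Reviews
VIA_SN18000G_WATTS = 26  # VIA board and processor together


def watts_left_for_disks(psu_watts, *loads):
    """Return what remains of the PSU budget after the given loads."""
    return psu_watts - sum(loads)


intel_left = watts_left_for_disks(PSU_WATTS, CPU_TDP_WATTS, INTEL_BOARD_WATTS)
via_left = watts_left_for_disks(PSU_WATTS, VIA_SN18000G_WATTS)

print("Intel board + CPU leaves %d W for the four disks" % intel_left)
print("VIA SN18000G leaves %d W for the four disks" % via_left)
```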
Imagine the difference between having a 100W light bulb (suffices for a whole small room) shining 365 days a year compared to a 25W light bulb (often used in bedside lamps) in a year. Other Mini-ITX mainboards with 4x S-ATA include the following ones: Conclusion For now, I decided to wait a little bit more for my VIA EPIA SN18000G board, which still seems to be the best board for the Chenbro ES34069 case although it is not really cheap. But if, in the not too distant future, I decide to have a desktop at home again, I’m quite sure it’ll sport a cute Mini-ITX case (perhaps a nice black-orange HFX micro M1 case by mCubed; unfortunately the M2 is no longer available in a color combination including orange ;-) with an Intel DG45FC or Kontron 986LCD-M/mITX and a decent Core 2 Duo processor. Software Plans Of course my home server will run Debian GNU/Linux 5.0 Lenny, with software RAID-5 and LVM2 over the 1.6 TB of S-ATA disks, resulting in 1.2 TB of available disk space which will be offered using at least NFS, SMB and SSH (think sshfs). Planned software includes BackupPC (a very fine pulling backup system for machines which are not online 24/7) and Privoxy. I’ll perhaps also install Tor and a caching proxy like Squid or Polipo. Another idea is to run Mediatomb on that machine. :-)
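The 1.6 TB versus 1.2 TB figures follow from how RAID-5 spends one disk's worth of space on parity. A minimal sketch of that arithmetic, assuming four equally sized 400 GB disks as described in the post; the helper name raid5_usable_gb is hypothetical, and LVM2 and filesystem overhead are ignored.

```python
def raid5_usable_gb(disk_count, disk_size_gb):
    """Usable capacity of a RAID-5 array of equally sized disks.

    One disk's worth of space is used for parity, so the usable
    capacity is (disk_count - 1) * disk_size_gb.
    """
    if disk_count < 3:
        raise ValueError("RAID-5 needs at least three disks")
    return (disk_count - 1) * disk_size_gb


raw_gb = 4 * 400                     # four 400 GB S-ATA disks
usable_gb = raid5_usable_gb(4, 400)  # space left after parity

print("raw: %d GB, usable with RAID-5: %d GB" % (raw_gb, usable_gb))
```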

26 April 2008

Craig Sanders: Keeping apt archives empty or not.

In this post, Steven Hanley wonders how to keep /var/cache/apt/archives empty on a machine that has a full mirror of Debian on it. That’s actually the wrong question. The right question is ‘Why is /var/cache/apt/archives even relevant on a machine with a full mirror of Debian?’ /etc/apt/sources.list has always supported file:/ URLs. For example, sources.list on my local Debian mirror looks like this:
deb file:/home/ftp/debian unstable main contrib non-free
deb file:/home/ftp/debian-multimedia/ unstable main
With a file URL, apt-get doesn’t need to download any packages, and thus doesn’t need to cache anything in /var/cache/apt/archives. No need to keep it empty, it just is empty. In a related post, Andrew Pollock suggests adding DPkg::Post-Invoke {"apt-get clean";}; to /etc/apt/apt.conf. That would be fine if you ONLY ever ran “apt-get update ; apt-get dist-upgrade”. It’s not fine if you run “apt-get -d” to download packages first, then install a few packages with “apt-get install package”, and then run “apt-get dist-upgrade”. If you try that, then the DPkg::Post-Invoke rule will clean out /var/cache/apt/archives when you install the individual package(s). When you then run the dist-upgrade, apt-get will have to download all the packages again. Better just to get into the habit of running apt-get clean or autoclean occasionally when you want to reclaim some disk space.

22 April 2008

Andrew Pollock: [debian] On keeping /var/cache/apt/archives empty

Steven Hanley was pondering how to keep /var/cache/apt/archives empty on the mirror that I help administer. This sounds vaguely similar to mounting the /usr filesystem read-write at the start of an APT run and read-only again at the end. (A practice I used to believe in, but due to various package upgrades making /usr busy for no good reason, and it artificially inflating the maximal mount-count and prematurely causing a fsck at boot, I've discontinued) So, putting
DPkg::Post-Invoke {"apt-get clean";};
in /etc/apt/apt.conf (or in a file in /etc/apt/apt.conf.d) ought to do the trick.
